The intersection axiom of - Universiteit Leidengill/Intersection.pdf · experiments, Grobner bases, real algebraic geometry, exponential families, exact test, maximum likelihood degree,

The intersection axiom of conditional independence:

some “new” resultsRichard D. Gill

Mathematical Institute, University LeidenThis version: 3 October, 2019

(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟹ X ⊥⊥ (Y, Z)

Algebraic Statistics seminar, Leiden, 27 February 2019; Combinatorics seminar 2019, SJTU, 2 October 2019

X is independent of Y given Z and X is independent of Z given Y, implies X is independent of Y and Z

The intersection axiom of conditional independence:

some “new” resultsRichard D. Gill

Mathematical Institute, University LeidenThis version: 2 October, 2019

(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟹ X ⊥⊥ (Y, Z)

Algebraic Statistics seminar, Leiden, 27 February 2019; Combinatorics seminar 2019, JTSU, 2 October 2019

Intersection axiom. Well known to be neither true nor even an axiom.

X is independent of Y given Z and X is independent of Z given Y, implies X is independent of Y and Z

Comfort zones All variables have:

• Finite outcome space [Nice for algebraic geometry]

• Countable outcome space

• Continuous joint density with respect to sigma-finite product measures [Usually not used rigorously]

• Outcome spaces are Polish ❤

Other “convenience” assumptions: Strictly positive joint density Multivariate normal also allows algebraic geometry approach

Algebraic Statistics

Seth Sullivant

North Carolina State University

E-mail address: [email protected]

2010 Mathematics Subject Classification. Primary 62-01, 14-01, 13P10,13P15, 14M12, 14M25, 14P10, 14T05, 52B20, 60J10, 62F03, 62H17,

90C10, 92D15

Key words and phrases. algebraic statistics, graphical models, contingencytables, conditional independence, phylogenetic models, design of

experiments, Grobner bases, real algebraic geometry, exponential families,exact test, maximum likelihood degree, Markov basis, disclosure limitation,

random graph models, model selection, identifiability

Abstract. Algebraic statistics uses tools from algebraic geometry, com-mutative algebra, combinatorics, and their computational sides to ad-dress problems in statistics and its applications. The starting pointfor this connection is the observation that many statistical models aresemialgebraic sets. The algebra/statistics connection is now over twentyyears old– this book presents the first comprehensive and introductorytreatment of the subject. After background material in probability, al-gebra, and statistics, the book covers a range of topics in algebraicstatistics including algebraic exponential families, likelihood inference,Fisher’s exact test, bounds on entries of contingency tables, design ofexperiments, identifiability of hidden variable models, phylogenetic mod-els, and model selection. The book is suitable for both classroom useand independent study, as it has numerous examples, references, andover 150 exercises.

Graduate Studies in MathematicsVolume: 194; 2018; 490 pp; HardcoverMSC: Primary 62; 14; 13; 52; 60; 90; 92;

Print ISBN: 978-1-4704-3517-2

Inspiration: study group on algebraic statistics

The (semi-)graphoid axioms of (conditional) independence

1. Symmetry

2. Decomposition

3. Weak union

4. Contraction

5. Intersection ( X ⊥⊥ Y ∣ Z & X ⊥⊥ Z ∣ Y ) ⟹ X ⊥⊥ (Y, Z)

X ⊥⊥ Y ⟹ Y ⊥⊥ X

X ⊥⊥ (Y, Z) ⟹ X ⊥⊥ Y

X ⊥⊥ (Y, Z) ⟹ X ⊥⊥ Y ∣ Z

( X ⊥⊥ Z ∣ Y & X ⊥⊥ Y ) ⟹ X ⊥⊥ (Y, Z)

1–5: (with further global conditioning): the graphoid axioms. Phil Dawid (1980). 1–4: ( … ): the semi-graphoid axioms So called because of similarity to *graph separation* for subgraphs of a simple undirected graph: A is separated from B by C

• The intersection axiom (nr 5):

• “New” result:

where W:= f(Y) = g(Z) for some f, g

• In particular, we can take W = Law((Y, Z) | X)

• If f and g are trivial (constant) we obtain “axiom 5”

• Also “new”: Nontrivial f, g exist such that f(Y) = g(Z) a.e. iff A, B exist with probabilities strictly between 0 and 1 s.t.

Call such a joint law decomposable

(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟺ X ⊥⊥ (Y, Z) ∣ W

Pr(Y ∈ A & Z ∈ Bc) = 0 = Pr(Y ∈ Ac & Z ∈ B)

(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟹ X ⊥⊥ (Y, Z)

• The intersection axiom:







(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟺ X ⊥⊥ (Y, Z) ∣ W

Pr(Y ∈ A & Z ∈ Bc) = 0 = Pr(Y ∈ Ac & Z ∈ B)

(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟹ X ⊥⊥ (Y, Z)








(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟺ X ⊥⊥ (Y, Z) ∣ W

Pr(Y ∈ A & Z ∈ Bc) = 0 = Pr(Y ∈ Ac & Z ∈ B)

(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟹ X ⊥⊥ (Y, Z)








(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟺ X ⊥⊥ (Y, Z) ∣ W

Pr(Y ∈ A & Z ∈ Bc) = 0 = Pr(Y ∈ Ac & Z ∈ B)

(X ⊥⊥ Y ∣ Z) & (X ⊥⊥ Z ∣ Y ) ⟹ X ⊥⊥ (Y, Z)

Construction of counter example

More elaborate counter example Leading to the general theorem

Proof of new rule Discrete case

Comfort zones• All variables have finite support (Algebraic Geometry)

• All variables have countable support

• All variables have continuous joint probability densities (many applied statisticians)

• All densities are strictly positive

• All distributions are non-degenerate Gaussian

• All variables take values in Polish spaces (My favourite)Polish space: a topological space which can be given a metric making it complete and separable

Please recall• The joint probability distribution of X and Y can be disintegrated into

the marginal distribution of X and a family of conditional distributions of Y given X = x

• The disintegration is unique up to almost everywhere equivalence

• Conditional independence of X and Y given Z is just ordinary independence within each of the joint laws of X and Y conditional on Z = z

• For me, 0/0 = “undefined” and 0 x “undefined” = 0 (probability times number)

• So: conditional distributions do exist if we condition on zero probability events; they’re just not uniquely defined.

• The non-uniqueness is harmless

Some new notation• I’ll denote by “law(X)” the probability distribution (law) of X, where X is a random variable

which takes values in a space 𝒳. So law(X) is a probability distribution on 𝒳

• In the finite, discrete case, a “law” is just a vector of real numbers, non-negative, adding to one.

• In the Polish case, the set of probability laws on a given Polish space is itself a Polish space under, e.g., the Wasserstein metric. Disintegrations exist, Everything is nice.

• The family of conditional distributions of X given Y, (law(X | Y = y))y ∈ 𝒴 can be thought of as a function of y ∈ 𝒴. In the Polish case, the function is Borel measurable.

• As a function of the random variable Y, we can consider it as a random variable, or as a random vector talking values in an affine space.

• By Law(X | Y) I’ll denote that random variable, taking values in the space of probability laws on 𝒳.

Note distinction: Law vs. law

Crucial lemma

X ⫫ Y | Law(X | Y)

Proof of lemma, discrete case

Recall, X ⫫ Y | Z ⟺ p(x, y, z) = g(x, z) h(y, z)

Thus X ⫫ Y | L ⟺ we can factor p(x, y, l) this way

Given function p(x, y), pick any

Lemma: X ⫫ Y | Law(X | Y)

x ∈ 𝒳, y ∈ 𝒴, ℓ ∈ Δ|𝒳|−1

p(x, y, ℓ) = p(x, y) ⋅ 1{ℓ = p( ⋅ , y)/p(y)}

= ℓ(x)p(y)1{ℓ = p( ⋅ , y)/p(y)}

= Eval(ℓ, x) . p(y)1{ℓ = p( ⋅ , y)/p(y)}Proof of lemma, Polish caseSimilar, but a tiny bit different – we don’t assume existence of joint densities!

𝝙d = probability simplex, dimension d capital L = Law(X | Y), a random probability measure

Small 𝓁 (“ell”) is a possible realisation

• X ⫫ Y | Z ⟹ Law(X | Y, Z) = Law(X | Z)

• X ⫫ Z | Y ⟹ Law(X | Y, Z) = Law(X | Y)

• So we have w(Y, Z) = g(Z) = f(Y) =: W for some functions w, g, f

• By our lemma, X ⫫ (Y, Z) | Law(X | (Y, Z))

• We found functions g, f such that g(Z) = f(Y) and, with W:= w(Y, Z) = g(Z) = f(Y), X ⫫ (Y, Z) | W

Proof of forwards implication

• Suppose X ⫫ (Y, Z) | W where W = g(Z) = f(Y) for some functions g, f

• By axiom 3, X ⫫ Y | (W, Z)

• So X ⫫ Y | (g(Z), Z)

• So X ⫫ Y | Z

• Similarly, X ⫫ Z | Y

Proof of reverse implication

Sullivant• Uses primary decomposition of toric ideals to come up with a nice

parametrisation of the model “Axiom 5”

• Given: finite sets 𝒳, 𝒴, 𝒵, what is the set of all probability measures on their product satisfying Axiom 5, and with p(y) > 0, p(z) > 0, for all y, z ?

• Answer: pick partitions of 𝒴, 𝒵 which are in 1-1 correspondence with one another. Call one of them “𝒲”. Pick a positive probability distribution on 𝒲. Pick indecomposable probability distributions on the products of corresponding partition elements of 𝒴 and 𝒵. Pick probability distributions on 𝒳, also corresponding to the preceding, not necessarily all different

• Now put them together: in simulation terms: generate r.v. W = w∊𝒲. Generate (Y,Z) given W =w and independently thereof generate X given W = w.

Polish spaces

• Exactly same construction … just replace “partition” by a Borel measurable map onto another Polish space

• “Corresponding partitions” … Borel measurable maps onto same Polish space

Questions

• Does algebraic geometry provide any further “statistical” insights?

• Can some of you join me to turn all these ideas into a nice joint paper?

• Could there be a category theoretical meta-theorem?

Sullivant, book, ch. 4, esp. section 4.3.1

ReferencesBibliography 463

[Fel81] Joseph Felsenstein, Evolutionary trees from dna sequences: a maximumlikelihood approach, Journal of Molecular Evolution 17 (1981), 368–376.

[Fel03] , Inferring Phylogenies, Sinauer Associates, Inc., Sunderland, 2003.

[FG12] Shmuel Friedland and Elizabeth Gross, A proof of the set-theoretic versionof the salmon conjecture, J. Algebra 356 (2012), 374–379. MR 2891138

[FHKT08] Conor Fahey, Serkan Hosten, Nathan Krieger, and Leslie Timpe, Leastsquares methods for equidistant tree reconstruction, arXiv:0808.3979, 2008.

[Fin10] Audrey Finkler, Goodness of fit statistics for sparse contingency tables,arXiv:1006.2620, 2010.

[Fin11] Alex Fink, The binomial ideal of the intersection axiom for conditionalprobabilities, J. Algebraic Combin. 33 (2011), no. 3, 455–463. MR 2772542

[Fis66] Franklin M. Fisher, The identification problem in econometrics, McGraw-Hill, 1966.

[FKS16] Stefan Forcey, Logan Keefe, and William Sands, Facets of the balancedminimal evolution polytope, J. Math. Biol. 73 (2016), no. 2, 447–468.MR 3521111

[FKS17] , Split-facets for balanced minimal evolution polytopes and thepermutoassociahedron, Bull. Math. Biol. 79 (2017), no. 5, 975–994.MR 3634386

[FPR11] Stephen E. Fienberg, Sonja Petrovic, and Alessandro Rinaldo, Algebraicstatistics for p1 random graph models: Markov bases and their uses, Look-ing back, Lect. Notes Stat. Proc., vol. 202, Springer, New York, 2011,pp. 21–38. MR 2856692

[FR07] Stephen E. Fienberg and Alessandro Rinaldo, Three centuries of categoricaldata analysis: log-linear models and maximum likelihood estimation, J.Statist. Plann. Inference 137 (2007), no. 11, 3430–3445. MR 2363267

[FS08] Stephen E. Fienberg and Aleksandra B. Slavkovic, A survey of statisti-cal approaches to preserving confidentiality of contingency table entries,Privacy-Preserving Data Mining: Models and Algorithms, Springer, 2008,pp. 291–312.

[Gal57] David Gale, A theorem on flows in networks, Pacific J. Math. 7 (1957),1073–1082. MR 0091855

[GBN94] Dan Gusfield, Krishnan Balasubramanian, and Dalit Naor, Parametric op-timization of sequence alignment, Algorithmica 12 (1994), no. 4-5, 312–326.MR 1289485

[GJdSW84] Robert Grone, Charles R. Johnson, Eduardo M. de Sa, and Henry Wolkow-icz, Positive definite completions of partial Hermitian matrices, Linear Al-gebra Appl. 58 (1984), 109–124. MR 739282

[GKZ94] Izrael Gelfand, Mikhail M. Kapranov, and Andrei V. Zelevinsky, Dis-criminants, resultants, and multidimensional determinants, Mathemat-ics: Theory & Applications, Birkhauser Boston, Inc., Boston, MA, 1994.MR 1264417 (95e:14045)

[GMS06] Daniel Geiger, Chris Meek, and Bernd Sturmfels, On the toric algebra ofgraphical models, Ann. Statist. 34 (2006), no. 3, 1463–1492.

[GPPS13] Luis Garcia-Puente, Sonja Petrovic, and Seth Sullivant, Graphical models,J. Softw. Algebra Geom. 5 (2013), 1–7.

Bibliography 469

R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 459 (2003), no. 2039,2821–2845.

[ND85] Peter G. Norton and Earl V. Dunn, Snoring as a risk factor for disease:an epidemiological survey., British Medical Journal (Clinical Research Ed.)291 (1985), 630–632.

[Ney71] Jerzy Neyman, Molecular studies of evolution: a source of novel statisticalproblems, Statistical decision theory and related topics, 1971, pp. 1–27.

[Nor98] J. R. Norris, Markov chains, Cambridge Series in Statistical and Proba-bilistic Mathematics, vol. 2, Cambridge University Press, Cambridge, 1998,Reprint of 1997 original. MR 1600720 (99c:60144)

[Ohs10] Hidefumi Ohsugi, Normality of cut polytopes of graphs in a minor closedproperty, Discrete Math. 310 (2010), no. 6-7, 1160–1166. MR 2579848(2011e:52022)

[OHT13] Mitsunori Ogawa, Hisayuki Hara, and Akimichi Takemura, Graver basis foran undirected graph and its application to testing the beta model of randomgraphs, Ann. Inst. Statist. Math. 65 (2013), no. 1, 191–212. MR 3011620

[OS99] Shmuel Onn and Bernd Sturmfels, Cutting corners, Adv. in Appl. Math.23 (1999), no. 1, 29–48. MR 1692984 (2000h:52019)

[Pau00] Yves Pauplin, Direct calculation of a tree length using a distance matrix,Journal of Molecular Evolution 51 (2000), 41–47.

[Pea82] Judea Pearl, Reverend bayes on inference engines: A distributed hierarchi-cal approach, Proceedings of the Second National Conference on ArtificialIntelligence. AAAI-82: Pittsburgh, PA., Menlo Park, California: AAAIPress., 1982, pp. 133–136.

[Pea09] , Causality, second ed., Cambridge University Press, Cambridge,2009, Models, reasoning, and inference. MR 2548166 (2010i:68148)

[Pet14] Jonas Peters, On the intersection property of conditional independence andits application to causal discovery, Journal of Causal Inference 3 (2014),97–108.

[PRF10] Sonja Petrovic, Alessandro Rinaldo, and Stephen E. Fienberg, Algebraicstatistics for a directed random graph model with reciprocation, Algebraicmethods in statistics and probability II, Contemp. Math., vol. 516, Amer.Math. Soc., Providence, RI, 2010, pp. 261–283. MR 2730754

[PRW01] Giovanni Pistone, Eva Riccomagno, and Henry P. Wynn, Algebraic Statis-tics, Monographs on Statistics and Applied Probability, vol. 89, Chapman& Hall/CRC, Boca Raton, FL, 2001.

[PS04a] Lior Pachter and Bernd Sturmfels, Parametric inference for biological se-quence analysis, Proc. Natl. Acad. Sci. USA 101 (2004), no. 46, 16138–16143. MR 2114587

[PS04b] , The tropical geometry of statistical models, Proc. Natl. Acad. Sci.USA 101 (2004), 16132–16137.

[PS05] L. Pachter and B. Sturmfels (eds.), Algebraic Statistics for ComputationalBiology, Cambridge University Press, New York, 2005.

[R C16] R Core Team, R: A language and environment for statistical computing, RFoundation for Statistical Computing, Vienna, Austria, 2016.

[Rai12] Claudiu Raicu, Secant varieties of Segre-Veronese varieties, Algebra Num-ber Theory 6 (2012), no. 8, 1817–1868. MR 3033528

Bibliography 461

[DMM10] Alicia Dickenstein, Laura Felicia Matusevich, and Ezra Miller, Combina-torics of binomial primary decomposition, Math. Z. 264 (2010), no. 4, 745–763. MR 2593293 (2011b:13063)

[Dob02] Adrian Dobra, Statistical tools for disclosure limitation in multi-way con-tingency tables, ProQuest LLC, Ann Arbor, MI, 2002, Thesis (Ph.D.)–Carnegie Mellon University. MR 2703763

[Dob03] Adrian Dobra, Markov bases for decomposable graphical models, Bernoulli9 (2003), no. 6, 1–16.

[DP17] Mathias Drton and Martyn Plummer, A Bayesian information criterionfor singular models, J. R. Stat. Soc. Ser. B. Stat. Methodol. 79 (2017),no. 2, 323–380, With discussions. MR 3611750

[DR72] John N. Darroch and D. Ratcli↵, Generalized iterative scaling for log-linearmodels, Ann. Math. Statist. 43 (1972), 1470–1480.

[DR08] Mathias Drton and Thomas S. Richardson, Graphical methods for e�cientlikelihood inference in Gaussian covariance models, J. Mach. Learn. Res. 9(2008), 893–914. MR 2417257

[Drt09a] Mathias Drton, Discrete chain graph models, Bernoulli 15 (2009), 736–753.

[Drt09b] , Likelihood ratio tests and singularities, Ann. Statist. 37 (2009),no. 2, 979–1012. MR 2502658 (2010d:62143)

[DS98] Persi Diaconis and Bernd Sturmfels, Algebraic algorithms for sampling fromconditional distributions, Ann. Statist. 26 (1998), no. 1, 363–397.

[DS04] Adrian Dobra and Seth Sullivant, A divide-and-conquer algorithm for gen-erating Markov bases of multi-way tables, Comput. Statist. 19 (2004), no. 3,347–366.

[DS13] Ruth Davidson and Seth Sullivant, Polyhedral combinatorics of UPGMAcones, Adv. in Appl. Math. 50 (2013), no. 2, 327–338. MR 3003350

[DS14] Ruth Davidson and Seth Sullivant, Distance-based phylogenetic methodsaround a polytomy, IEEE/ACM Transactions on Computational Biologyand Bioinformatics 11 (2014), 325–335.

[DSS07] Mathias Drton, Bernd Sturmfels, and Seth Sullivant, Algebraic factor anal-ysis: tetrads, pentads and beyond, Probab. Theory Related Fields 138(2007), no. 3-4, 463–493.

[DSS09] , Lectures on Algebraic Statistics, Oberwolfach Seminars, vol. 39,Birkhauser Verlag, Basel, 2009. MR 2723140

[DW16] Mathias Drton and Luca Weihs, Generic identifiability of linear structuralequation models by ancestor decomposition, Scand. J. Stat. 43 (2016), no. 4,1035–1045. MR 3573674

[Dwo06] Cynthia Dwork, Di↵erential privacy, Automata, languages and program-ming. Part II, Lecture Notes in Comput. Sci., vol. 4052, Springer, Berlin,2006, pp. 1–12. MR 2307219

[DY10] Mathias Drton and Josephine Yu, On a parametrization of positive semi-definite matrices with zeros, SIAM J. Matrix Anal. Appl. 31 (2010), no. 5,2665–2680. MR 2740626 (2011j:15057)

[EAM95] Robert J. Elliott, Lakhdar Aggoun, and John B. Moore, Hidden Markovmodels, Applications of Mathematics (New York), vol. 29, Springer-Verlag,New York, 1995, Estimation and control. MR 1323178

Bibliography 461

[DMM10] Alicia Dickenstein, Laura Felicia Matusevich, and Ezra Miller, Combina-torics of binomial primary decomposition, Math. Z. 264 (2010), no. 4, 745–763. MR 2593293 (2011b:13063)

[Dob02] Adrian Dobra, Statistical tools for disclosure limitation in multi-way con-tingency tables, ProQuest LLC, Ann Arbor, MI, 2002, Thesis (Ph.D.)–Carnegie Mellon University. MR 2703763

[Dob03] Adrian Dobra, Markov bases for decomposable graphical models, Bernoulli9 (2003), no. 6, 1–16.

[DP17] Mathias Drton and Martyn Plummer, A Bayesian information criterionfor singular models, J. R. Stat. Soc. Ser. B. Stat. Methodol. 79 (2017),no. 2, 323–380, With discussions. MR 3611750

[DR72] John N. Darroch and D. Ratcli↵, Generalized iterative scaling for log-linearmodels, Ann. Math. Statist. 43 (1972), 1470–1480.

[DR08] Mathias Drton and Thomas S. Richardson, Graphical methods for e�cientlikelihood inference in Gaussian covariance models, J. Mach. Learn. Res. 9(2008), 893–914. MR 2417257

[Drt09a] Mathias Drton, Discrete chain graph models, Bernoulli 15 (2009), 736–753.

[Drt09b] , Likelihood ratio tests and singularities, Ann. Statist. 37 (2009),no. 2, 979–1012. MR 2502658 (2010d:62143)

[DS98] Persi Diaconis and Bernd Sturmfels, Algebraic algorithms for sampling fromconditional distributions, Ann. Statist. 26 (1998), no. 1, 363–397.

[DS04] Adrian Dobra and Seth Sullivant, A divide-and-conquer algorithm for gen-erating Markov bases of multi-way tables, Comput. Statist. 19 (2004), no. 3,347–366.

[DS13] Ruth Davidson and Seth Sullivant, Polyhedral combinatorics of UPGMAcones, Adv. in Appl. Math. 50 (2013), no. 2, 327–338. MR 3003350

[DS14] Ruth Davidson and Seth Sullivant, Distance-based phylogenetic methodsaround a polytomy, IEEE/ACM Transactions on Computational Biologyand Bioinformatics 11 (2014), 325–335.

[DSS07] Mathias Drton, Bernd Sturmfels, and Seth Sullivant, Algebraic factor anal-ysis: tetrads, pentads and beyond, Probab. Theory Related Fields 138(2007), no. 3-4, 463–493.

[DSS09] , Lectures on Algebraic Statistics, Oberwolfach Seminars, vol. 39,Birkhauser Verlag, Basel, 2009. MR 2723140

[DW16] Mathias Drton and Luca Weihs, Generic identifiability of linear structuralequation models by ancestor decomposition, Scand. J. Stat. 43 (2016), no. 4,1035–1045. MR 3573674

[Dwo06] Cynthia Dwork, Di↵erential privacy, Automata, languages and program-ming. Part II, Lecture Notes in Comput. Sci., vol. 4052, Springer, Berlin,2006, pp. 1–12. MR 2307219

[DY10] Mathias Drton and Josephine Yu, On a parametrization of positive semi-definite matrices with zeros, SIAM J. Matrix Anal. Appl. 31 (2010), no. 5,2665–2680. MR 2740626 (2011j:15057)

[EAM95] Robert J. Elliott, Lakhdar Aggoun, and John B. Moore, Hidden Markovmodels, Applications of Mathematics (New York), vol. 29, Springer-Verlag,New York, 1995, Estimation and control. MR 1323178

References (cont.)• https://www.math.leidenuniv.nl/~vangaans/jancol1.pdf

• van der Vaart & Wellner (1996), Weak Convergence and Empirical Processes

• Ghosal & van der Vaart (2017), Fundamentals of Nonparametric Bayesian Inference

• Aad van der Vaart (2019), personal communication

https://www.math.leidenuniv.nl/~vangaans/jancol1.pdf

Documents

The intersection axiom of - Universiteit Leidengill/Intersection.pdf · experiments, Grobner bases, real algebraic geometry, exponential families, exact test, maximum likelihood degree,