Analysis of Non-Euclidean Data: Use of Differential Geometry in Statistics Rabi Bhattacharya, The University of Arizona, Tucson, AZ [Research supported by NSF grant DMS1406872] June, 2016 Based on joint work with A. Bhattacharya (BB), Lizhen Lin and V. Patrangenaru (BP)


Analysis of Non-Euclidean Data: Use of Differential Geometry in Statistics

Rabi Bhattacharya, The University of Arizona, Tucson, AZ

[Research supported by NSF grant DMS1406872]

June, 2016

Based on joint work with A. Bhattacharya (BB), Lizhen Lin and V. Patrangenaru (BP)


CONTENTS

1. Introduction

2. Frechet Means of Probabilities on Metric Spaces: Uniqueness; Consistency & CLT for Sample Frechet Means.

3. Applications and Examples.

(a). $S^d$ (Sphere): Paleomagnetism.

(b). Kendall's Planar Shape Space $\Sigma_2^k$: Two-Sample Tests for (1) Schizophrenia and (2) Male & Female Gorilla Skulls.

(c). 3D Shape Space $R\Sigma_3^k$: Matched Pair Test for Glaucoma.

(d). The space $Sym^+(p)$ of positive definite matrices.

(e). Stratified Spaces: (1) $\Sigma_m^k$ ($m > 2$), (2) Open Book.

4. Nonparametric Bayes Theory on Manifolds & Applications.


GEOMETRY. A manifold $M$ of dimension $d$: a metric space with each point having a neighborhood diffeomorphic to an open set in $\mathbb{R}^d$; these maps on intersecting neighborhoods are smoothly connected.

EXAMPLE 1. Sphere $S^d = \{x \in \mathbb{R}^{d+1} : \|x\| = 1\}$ ($d \ge 1$). (Covered by two stereographic maps.)

Extrinsic, or chord, distance $d(p, q) = \|p - q\|$ (Euclidean distance inherited from an embedding $J : M \to \mathbb{R}^N$). On $S^d$, $J$ is the inclusion map $S^d \to \mathbb{R}^{d+1}$.


Geometry

A tangent vector $v$ at $p$ is the derivative $dc(t)/dt$ at $t = 0$ of a smooth curve $c(t)$, $0 \le t \le a$, with $c(0) = p$, on $M$ (computed in local coordinates in a neighborhood of $p$). The set of tangent vectors at $p$ is a $d$-dimensional vector space $T_p(M)$.

$T_p(S^d) = \{v \in \mathbb{R}^{d+1} : v \text{ orthogonal to } p\}$.


Geometry

Geodesic, or intrinsic, distance $\rho_g(p, q)$: the minimum arc length over smooth curves joining $p$ and $q$ [depends on a metric tensor $g$ providing inner products smoothly on the tangent spaces of $M$]. The arc length of a curve $c(t)$, $0 \le t \le a$, from $p$ to $q$ is

$\int_0^a |dc(t)/dt| \, dt$.

On $S^d$, $\rho_g(p, q)$ is the arc length along the great circle joining $p$ and $q$ (the shorter arc).
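The two distances on the sphere can be compared numerically. The following is a minimal sketch, assuming points are stored as unit vectors in NumPy arrays; the function names are illustrative and not taken from the slides. It uses the standard closed form $\rho_g(p, q) = \arccos\langle p, q\rangle$ for the great-circle distance.

```python
import numpy as np

def chord_distance(p, q):
    """Extrinsic (chord) distance ||p - q|| between two unit vectors."""
    return float(np.linalg.norm(np.asarray(p, float) - np.asarray(q, float)))

def geodesic_distance(p, q):
    """Intrinsic (great-circle) distance arccos(<p, q>) on the unit sphere."""
    c = float(np.dot(p, q))
    # Clip guards against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

# Two orthogonal points on S^2.
p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])
chord = chord_distance(p, q)
geo = geodesic_distance(p, q)
```

For unit vectors the two are related by chord = 2 sin(geo / 2), so the chord distance is always the smaller of the two.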


Geometry

Geodesics on $M$: curves that (locally) minimize geodesic distances between points. A geodesic from $p$ is entirely determined by (the initial point $p$ and) a tangent vector $v$ at $p$. On $S^d$ geodesics are the great circles.

Cut point of $p$ is the point along a geodesic from $p$ beyond which the geodesic arc length is not distance minimizing. Cut locus of $p$ is the collection of all cut points of $p$. On $S^d$ the cut point of $p$ (along each geodesic) is $-p$ (antipodal point), so the cut locus of $p$ is $\{-p\}$.
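The cut-point behaviour on the sphere can be illustrated with the exponential map, which follows the geodesic from $p$ with initial velocity $v$ for unit time. This is a sketch under the assumption that $p$ is a unit vector and $v$ is orthogonal to $p$; the helper names are hypothetical. Travelling an arc length beyond $\pi$ passes the antipode $-p$, after which the distance back to $p$ is shorter than the arc travelled.

```python
import numpy as np

def sphere_exp(p, v):
    """End point of the unit-time geodesic from p with initial velocity v (v orthogonal to p)."""
    t = np.linalg.norm(v)
    if t < 1e-15:
        return p.copy()
    return np.cos(t) * p + np.sin(t) * (v / t)

def geodesic_distance(p, q):
    """Great-circle distance between unit vectors."""
    return float(np.arccos(np.clip(np.dot(p, q), -1.0, 1.0)))

p = np.array([0.0, 0.0, 1.0])
e = np.array([1.0, 0.0, 0.0])                 # unit tangent direction at p

before_cut = sphere_exp(p, 0.9 * np.pi * e)   # arc length 0.9*pi, still minimizing
at_cut = sphere_exp(p, np.pi * e)             # arc length pi, reaches the antipode -p
past_cut = sphere_exp(p, 1.2 * np.pi * e)     # arc length 1.2*pi, no longer minimizing
```

Here the distance from `p` to `past_cut` is 0.8π rather than the 1.2π travelled, because the shorter great-circle arc runs the other way.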


Example 2.

EXAMPLE 2. $M = S^d/G$: the space of orbits of $S^d$ under a (Lie) group $G$ of isometries of $S^d$.

For $p \in S^d$, $[p] = \{hp : h \in G\}$ is the orbit of $p$, and $M = \{[p] : p \in S^d\}$.

Intrinsic distance $\rho_g([p], [q]) = \inf\{\rho_g(hp, h'q) : h, h' \in G\}$.


Example 2(a). Axial Space $RP^d$

Axial space $RP^d$ = {set of all lines passing through the origin in $\mathbb{R}^{d+1}$}, also identified as {the set of pairs of points $(p, -p) : p \in S^d$}, and as $S^d/G$, where $G = \{h, Id\}$, with $hp = -p$.

Cut locus of a point can be identified with $RP^{d-1}$.
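For the axial space the orbit distance of Example 2 reduces to a minimum over the two representatives $\pm q$, since $G$ acts by isometries. A minimal sketch, assuming axes are represented by unit vectors (the function name is illustrative):

```python
import numpy as np

def axial_distance(p, q):
    """Quotient geodesic distance on RP^d: minimum over the representatives q and -q."""
    c = float(np.clip(np.dot(p, q), -1.0, 1.0))
    return float(min(np.arccos(c), np.arccos(-c)))

p = np.array([1.0, 0.0, 0.0])
q = np.array([-np.cos(0.3), np.sin(0.3), 0.0])   # nearly antipodal to p on S^2
r = np.array([0.0, 1.0, 0.0])                    # orthogonal to p

d_pq = axial_distance(p, q)
d_pr = axial_distance(p, r)
```

Although `p` and `q` are far apart on the sphere, the axes they span are only 0.3 radians apart. The largest possible axial distance is π/2, attained by orthogonal axes such as `p` and `r`.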


Example 2 (b): Kendall's planar shape space $\Sigma_2^k$ ($k > 2$).

A $k$-ad is a set of $k$ points $\{(x_1, y_1), \dots, (x_k, y_k)\}$ in $\mathbb{R}^2$, not all the same.

$\Sigma_2^k$ is the set of all $k$-ads modulo translation, scaling and rotation in the plane.

That is, first subtract from each $(x_i, y_i)$ the mean of the $k$ points; then divide the centered vector by its Euclidean norm to get a point on the pre-shape sphere, identified as $S^{2k-3}$. Then let $\Sigma_2^k = S^{2k-3}/G$, where $G = SO(2)$, the group of rotations in the plane (a Lie group of isometries of dimension 1).
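The centering and scaling steps can be sketched in a few lines. This example assumes a $k$-ad is given as a $k \times 2$ array and encodes each point as a complex number; `preshape` is a hypothetical helper, not a function from the slides. It then checks that a translated, scaled and rotated copy of a triangle has a pre-shape differing from the original only by a unit complex factor, i.e. an element of $SO(2)$.

```python
import numpy as np

def preshape(kad):
    """Center a planar k-ad (k x 2 array) and scale it to unit norm; returns a complex k-vector."""
    xy = np.asarray(kad, dtype=float)
    z = xy[:, 0] + 1j * xy[:, 1]
    z = z - z.mean()                      # remove translation
    norm = np.linalg.norm(z)
    if norm == 0:
        raise ValueError("all k points coincide; not a valid k-ad")
    return z / norm                       # remove scale

triangle = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
# Rotate by theta, scale by 3, translate by (5, -2).
moved = 3.0 * triangle @ R.T + np.array([5.0, -2.0])

u = preshape(triangle)
w = preshape(moved)
```

Both `u` and `w` are centered unit vectors, and `w` equals `exp(1j * theta) * u`, so they lie on the same $SO(2)$ orbit and represent the same shape.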


Example 2 (b): Kendall's shape space $\Sigma_2^k$.

Hence $\Sigma_2^k$ has dimension $2k - 4$ and is called Kendall's space of planar shapes.

When the $k$-ads are represented as points on the complex plane and are centered, each $k$-ad lies in a space isomorphic to $\mathbb{C}^{k-1}$, and scaling and rotation of a point $p$ in $\mathbb{C}^{k-1}$ can be represented as $\{\lambda p : \lambda \in \mathbb{C}\}$, i.e., a complex line passing through the origin and $p$. The space of all such lines is the complex projective space $CP^{k-2}$. The cut locus of a point can be identified with $CP^{k-3}$, that is, with $\Sigma_2^{k-1}$.


FRECHET MEANS ON METRIC SPACES

Frechet function of a probability distribution $Q$ is

$F(p) = \int \rho^2(p, x) \, Q(dx)$, $p \in M$.

Frechet mean set is the set of minimizers of $F$. A unique minimizer is called the Frechet mean of $Q$, say $\mu$. Sample Frechet mean $\mu_n$ is a measurable selection from the mean set of the empirical $Q_n$ based on i.i.d. $X_1, \dots, X_n \sim Q$.
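As an illustration of a sample Frechet mean under the geodesic distance on $S^2$, the sketch below minimizes the sample Frechet function with a fixed-point iteration built from the log and exponential maps of the sphere. This iteration is a common numerical choice and is an assumption here, not a method stated on the slides; all function names are hypothetical. It is only intended for data concentrated well away from the cut locus of the mean.

```python
import numpy as np

def sphere_log(p, q):
    """Log map at p: tangent vector at p pointing to q, of length rho_g(p, q). Undefined at q = -p."""
    c = float(np.clip(np.dot(p, q), -1.0, 1.0))
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(p)
    return theta / np.sin(theta) * (q - c * p)

def sphere_exp(p, v):
    """Exponential map at p applied to a tangent vector v."""
    t = np.linalg.norm(v)
    if t < 1e-15:
        return p.copy()
    return np.cos(t) * p + np.sin(t) * (v / t)

def frechet_function(p, data):
    """Sample Frechet function: mean squared geodesic distance from p to the data."""
    d = np.arccos(np.clip(data @ p, -1.0, 1.0))
    return float(np.mean(d ** 2))

def intrinsic_sample_mean(data, iters=100, tol=1e-12):
    """Fixed-point iteration mu <- Exp_mu(mean of Log_mu(x_i)), started at the first data point."""
    mu = data[0].copy()
    for _ in range(iters):
        step = np.mean([sphere_log(mu, x) for x in data], axis=0)
        mu = sphere_exp(mu, step)
        mu = mu / np.linalg.norm(mu)      # guard against drift off the sphere
        if np.linalg.norm(step) < tol:
            break
    return mu

# Four points at geodesic distance a from the north pole, placed symmetrically.
a = 0.4
north = np.array([0.0, 0.0, 1.0])
data = np.array([
    [np.sin(a), 0.0, np.cos(a)],
    [-np.sin(a), 0.0, np.cos(a)],
    [0.0, np.sin(a), np.cos(a)],
    [0.0, -np.sin(a), np.cos(a)],
])
mu_n = intrinsic_sample_mean(data)
```

By symmetry the minimizer is the north pole, where the sample Frechet function equals $a^2$.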


FRECHET MEANS ON METRIC SPACES

Proposition 1. (Ziezold, 1977; BP, 2003) Let $F$ be finite. (i) Then the Frechet mean set is nonempty and compact. (ii) In the case of a unique minimizer, $\mu_n \to \mu$ (with probability one).

Remark 1. The extrinsic mean, based on $\rho$ inherited from Euclidean space $E^N$ via an embedding

$J : M \to E^N$,

is given by $\mu = J^{-1}(P_{J(M)} \mu_J(Q))$, if the projection $P_{J(M)}$ of the Euclidean mean $\mu_J$ of $Q \circ J^{-1}$ on $J(M)$ is unique.

If $M$ is Riemannian and $\rho$ is the geodesic distance, then the Frechet minimizer is called the intrinsic mean, if it is unique (uniqueness is an open problem).
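For the sphere with $J$ the inclusion map, the extrinsic mean of Remark 1 has a simple form: the projection of the Euclidean mean onto $S^d$ is its normalization, which is unique exactly when the Euclidean mean is not the origin. A minimal sketch for the sample version, with an illustrative function name:

```python
import numpy as np

def extrinsic_mean_sphere(data, tol=1e-12):
    """Sample extrinsic mean on S^d: project the Euclidean sample mean back onto the sphere."""
    xbar = np.mean(np.asarray(data, dtype=float), axis=0)
    r = np.linalg.norm(xbar)
    if r < tol:
        # Every point of the sphere is equally close to the origin: no unique projection.
        raise ValueError("Euclidean mean is at the origin; the extrinsic mean is not unique")
    return xbar / r

data = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])
mu_E = extrinsic_mean_sphere(data)

# An antipodal pair has Euclidean mean 0, so the projection is not unique.
try:
    extrinsic_mean_sphere(np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]))
    nonunique_detected = False
except ValueError:
    nonunique_detected = True
```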


FRECHET MEANS ON METRIC SPACES

Remark 2. The embedding $J$ considered here is equivariant under the action of a large Lie group $G$: there exists a group homomorphism $\Phi : G \to GL(N, E^N)$, $g \mapsto \Phi(g)$, such that

$\Phi(g)(J(x)) = J(gx)$, $\forall g \in G$, $x \in M$.


AN OMNIBUS CENTRAL LIMIT THEOREM (Bhattacharya and Lin (2016))

We make the following assumptions.

(A1) The Frechet mean $\mu$ of $Q$ is unique.

(A2) $\mu \in G$, $G \subset M$, and there exists a homeomorphism $\phi : G \to U$, with $U$ an open subset of $\mathbb{R}^s$ ($s \ge 1$), such that

$x \mapsto h(x; q) := \rho^2(\phi^{-1}(x), q)$    (1)

is $C^2$ on $U$, for every $q$ outside a $Q$-null set.

(A3) $P(\mu_n \text{ belongs to } G) \to 1$ as $n \to \infty$.


AN OMNIBUS CENTRAL LIMIT THEOREM

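(A4) Let $D_r h(x; q) = \partial h(x; q)/\partial x_r$, $r = 1, \dots, s$, and let $D_{r,r'} h$ denote the corresponding second-order partial derivatives. Then

$E|D_r h(\phi(\mu); Y_1)|^2 < \infty$, $E|D_{r,r'} h(\phi(\mu); Y_1)| < \infty$, $r, r' = 1, \dots, s$.    (2)

(A5) Let $u_{r,r'}(\varepsilon; q) = \sup\{|D_{r,r'} h(\theta; q) - D_{r,r'} h(\phi(\mu); q)| : |\theta - \phi(\mu)| < \varepsilon\}$. Then

$E|u_{r,r'}(\varepsilon; Y_1)| \to 0$ as $\varepsilon \to 0$, for all $1 \le r, r' \le s$.    (3)

(A6) The matrix $\Lambda = [E D_{r,r'} h(\phi(\mu); Y_1)]_{r,r'=1,\dots,s}$ is nonsingular.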


AN OMNIBUS CENTRAL LIMIT THEOREM FOR FRECHET MEAN

Theorem 2.1 (Bhattacharya and Lin (2016))

Under assumptions (A1)-(A6),

$n^{1/2}[\phi(\mu_n) - \phi(\mu)] \xrightarrow{\mathcal{L}} N(0, \Lambda^{-1} C \Lambda^{-1})$, as $n \to \infty$,    (4)

where $C$ is the covariance matrix of $\{D_r h(\phi(\mu); Y_1), r = 1, \dots, s\}$.
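As a sanity check on the form of the limit covariance in (4), consider the Euclidean special case. The following worked computation assumes $M = \mathbb{R}^s$ with the Euclidean distance, $\phi$ the identity map, and $Y_1$ with finite covariance matrix $\Sigma$ (the symbol $\Sigma$ is introduced here for illustration and does not appear on the slides).

```latex
% Euclidean sanity check of the limit covariance in (4).
% Assumptions: M = \mathbb{R}^s, \rho(p,q) = \|p - q\|, \phi = identity, \Sigma = \mathrm{Cov}(Y_1).
\begin{align*}
h(x; q) &= \|x - q\|^2, \\
D_r h(x; q) &= 2(x_r - q_r), \qquad D_{r,r'} h(x; q) = 2\,\delta_{r r'}, \\
\Lambda &= \big[ E\, D_{r,r'} h(\mu; Y_1) \big]_{r,r'=1,\dots,s} = 2 I_s, \\
C &= \mathrm{Cov}\big( 2(\mu - Y_1) \big) = 4\Sigma, \\
\Lambda^{-1} C \Lambda^{-1} &= \tfrac{1}{2} I_s \,(4\Sigma)\, \tfrac{1}{2} I_s = \Sigma .
\end{align*}
```

In this case the Frechet mean is the ordinary mean, the sample Frechet mean is the sample average, and (4) reduces to the classical multivariate central limit theorem with limit $N(0, \Sigma)$.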


A CENTRAL LIMIT THEOREM FOR FRECHET MEAN

Remark 3. For the intrinsic mean, Theorem 2.1 holds only if $Q$ assigns probability zero to a neighborhood of the cut locus of $\mu$.


A CLT for Intrinsic Means

Theorem 2.2 (Bhattacharya and Lin (2016))

Let $C(B)$ denote the set of cut loci of points $p \in B$. Also, let $\phi$ be the log map, or $\mathrm{Exp}^{-1}$. Suppose that $Q$ has an intrinsic mean $\mu$, and that $Q$ is absolutely continuous in a neighborhood $W$ of the cut locus of $\mu$, with a continuous density there with respect to the volume measure. Assume also that

(i) $Q(C(B(\mu; \varepsilon))) = O(\varepsilon^{d-c})$, $\varepsilon \to 0$, for some $c$, $0 \le c < d$;

(ii) on some neighborhood $V$ of $\nu = \phi(\mu) = 0$ the function $\theta \mapsto F(\phi^{-1}(\theta))$ is twice continuously differentiable with a nonsingular Hessian $\Lambda(\theta)$, and

(iii) (A4) holds with $\phi(\mu)$ replaced by $\theta$, $\forall \theta \in V$.

Then, if $d > c + 2$, one has the CLT (4) for the sample mean $\mu_n$.


A CLT for Intrinsic Means on Sd

Corollary 2.3

Let $M = S^d$, $d > 2$. If $Q$ has a $C^2$ density and has a unique intrinsic mean, then the CLT holds for the sample intrinsic mean.


EXAMPLES & APPLICATIONS

2(a). Example 1 ($S^d$).

Let $X_1, \dots, X_n$ be i.i.d. on $S^d$. The von Mises-Fisher distribution on the sphere $S^d$ has the following density (w.r.t. the uniform distribution on $S^d$):

$f(x; \mu, \tau) = C_d(\tau) \exp\{\tau \langle x, \mu \rangle\}$, $x \in S^d$ ($\mu \in S^d$, $\tau \ge 0$).    (5)

Intrinsic & extrinsic means are both $\mu$. The MLE of $\mu$ is the sample extrinsic mean $\mu_{n,E}$.


APPLICATIONS of $S^d$

Application 1 (Paleomagnetism). In a seminal paper, Fisher (1953) estimated mean directions of the magnetic pole for two sets of data: one from a recent period and one from a geologically different period in the past. Using the model (5), he constructed confidence regions for the two mean directions, and provided convincing evidence that the polarities had nearly reversed. We compare Fisher's confidence regions with nonparametric confidence regions for the extrinsic mean, for two sets of data.


Figure: Confidence regions for the direction of the earth's magnetic poles, using Fisher's method (red) and the nonparametric extrinsic method (blue), in Fisher's first example.


Figure: Confidence regions for the direction of the earth's magnetic poles, using Fisher's method (red) and the nonparametric extrinsic method (blue), based on the Jurassic period data of Irving (1963).

In both cases, Fisher's confidence regions are about 10% larger (in area) than those given by the nonparametric method.


KENDALL’S SHAPE SPACES

2(b). KENDALL'S SHAPE SPACES $\Sigma_m^k$.

Each observation $x = (x_1, \dots, x_k)$ consists of $k > m$ points in $m$ dimensions (not all the same): $k$ locations on an $m$-dimensional object. $k$-ads are equivalent mod $G$, a group of transformations.

$G$ is generated by translations, scaling (to unit size), and rotations.

Preshape

$u = (x_1 - \langle x \rangle, \dots, x_k - \langle x \rangle) / \|x - \langle x \rangle\|$, where $\langle x \rangle$ is the mean of the $k$ points.

$u \in S^{m(k-1)-1}$, the preshape sphere.

Shape of the $k$-ad: $\sigma(x) \in S^{m(k-1)-1}/SO(m) = \Sigma_m^k$.


KENDALL’S SHAPE SPACES

Case $m = 2$. Planar shapes. $M = \Sigma_2^k$.

$\sigma(x) = \sigma(u) \equiv [u] = \{e^{i\theta} u : -\pi < \theta \le \pi\}$.

$M \simeq S^{2k-3}/SO(2) \simeq CP^{k-2}$ (complex projective space).

Extrinsic mean $\mu_E$. Embedding:

$J : \sigma(x) \mapsto uu^* \in S(k, \mathbb{C})$ ($k \times k$ Hermitian matrices).


KENDALL’S SHAPE SPACES

Proposition 2. (BP (2003)) $\mu_E$ exists iff the largest eigenvalue $\lambda$ of $E(uu^*)$ is simple. [$J(\mu_E) = \mu_0 \mu_0^*$, $\mu_0$ a unit eigenvector for $\lambda$.]

Case $m > 2$. $\Sigma_m^k$ has singularities. Action of $SO(m)$ is not free on $M$.
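Proposition 2 suggests a direct recipe for the sample extrinsic mean of planar shapes: average the matrices $uu^*$ over the sample and take a unit eigenvector for the largest eigenvalue, provided that eigenvalue is simple. The sketch below assumes pre-shapes are given as centered, unit-norm complex $k$-vectors; the function name and tolerance are illustrative choices.

```python
import numpy as np

def planar_extrinsic_mean(preshapes, tol=1e-12):
    """Unit eigenvector for the largest eigenvalue of the sample mean of u u*, plus all eigenvalues."""
    U = np.asarray(preshapes, dtype=complex)      # rows are pre-shapes u_i
    A = (U.T @ U.conj()) / U.shape[0]             # (1/n) * sum_i u_i u_i^*, Hermitian
    evals, evecs = np.linalg.eigh(A)              # eigenvalues in ascending order
    if evals[-1] - evals[-2] < tol:
        raise ValueError("largest eigenvalue is not simple; extrinsic mean is not unique")
    return evecs[:, -1], evals

# One triangle observed under three different rotations: all observations share a shape.
z = np.array([0.0 + 0.0j, 2.0 + 0.0j, 0.0 + 1.0j])
u = z - z.mean()
u = u / np.linalg.norm(u)
sample = [np.exp(1j * t) * u for t in (0.0, 0.5, 2.0)]

mu0, evals = planar_extrinsic_mean(sample)
```

Because $uu^*$ is unchanged when $u$ is multiplied by $e^{i\theta}$, the rotations have no effect on the averaged matrix, and `mu0` agrees with `u` up to a unit complex factor.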


KENDALL’S SHAPE SPACES

Remark 4 (Riemannian Submersions). Other than regular submanifolds of $E^D$, such as $S^d$, which inherit the Euclidean metric tensor, most of the manifolds of interest here are of the form $M = N/G$, where $N$ is a Riemannian manifold and $G$ is a Lie group of isometries on $N$. $M$ is the space of orbits $O_x = \{gx, g \in G\}$ ($x \in N$). The tangent space $T_p(M)$ at $p = O_x \in M$ is the horizontal subspace of $T_x(N)$, orthogonal to the direction along the orbit. $T_p(M)$ inherits the metric tensor from $T_x(N)$ on this subspace.

Example: $N = S^{2(k-1)-1} = S^{2k-3}$, $G = SO(2)$, $M = \Sigma_2^k$.


EXAMPLES & APPLICATIONS FOR $\Sigma_2^k$

Two-sample problem: discrimination between two shape distributions.

APPLICATION 2. (Bookstein (1991), Dryden and Mardia (1998), BP (2005), BB (2008), (2012)). Brain scan shapes of schizophrenic and normal patients.

$k = 13$ landmarks were recorded on the midsagittal slice of the brain scan of each of $n_1 = 14$ schizophrenic patients and $n_2 = 14$ normal patients (Bookstein (1991)). Shape space $\Sigma_2^{13}$.


Figure: (a) and (b) show 13 landmarks for 14 normal and 14 schizophrenic children, respectively, along with the respective mean shapes. * corresponds to the mean shapes' landmarks.


Figure: The sample extrinsic means for the 2 groups along with the pooled sample mean, corresponding to Figure 3.

p-values:

Nonparametric tests (intrinsic & extrinsic): $4 \times 10^{-11}$

Goodall's parametric test: 0.01

Hotelling's $T^2$ test: 0.66


EXAMPLES & APPLICATIONS FOR $\Sigma_2^k$

APPLICATION 3. Shapes of male and female gorilla skulls.

$k = 8$ landmarks, $n_1 = 29$ male skulls, $n_2 = 30$ female skulls (BP (2005), BB (2008), (2012), Dryden & Mardia (1998)). Shape space $\Sigma_2^8$.


Figure: (a) and (b) show 8 landmarks from skulls of 30 female and 29 male gorillas, respectively, along with the respective sample mean shapes. * corresponds to the mean shapes' landmarks.

p-values:

Nonparametric tests (intrinsic & extrinsic): $< 10^{-16}$

Parametric tests (Hotelling's $T^2$ test, Box's M-test): 0.0001


2 (c). 3D Shape Space $R\Sigma_3^k$: Match Pair Test for Glaucoma

Assume the affine span of each k-ad is $\mathbb{R}^m$, with preshape $u = (u_1, \ldots, u_k) \in S^{m(k-1)-1}$.

Shape $\sigma(x) \in S^{m(k-1)-1}/O(m) = M$. Embedding

$$J : \sigma(x) \mapsto ((u_i \cdot u_j)) \qquad (M \to S_0^{+}(k, \mathbb{R}))$$

(Bandulasiri and Patrangenaru (2005), Bandulasiri and BP (2009), Dryden, Kume, Le, Wood (2008))

Proposition 3 (A. Bhattacharya (2008)). Let $\lambda_1 \ge \ldots \ge \lambda_k$ be the eigenvalues of $E((u_i \cdot u_j))$, with corresponding orthonormal eigenvectors $U_1, \ldots, U_k$. Then (i) $\mu_E$ exists iff $\lambda_m > \lambda_{m+1}$ and then (ii) $J(\mu_E) = (v_1, \ldots, v_m)(v_1, \ldots, v_m)^t$, where $v_j = (\lambda_j - \bar\lambda + 1/m)^{1/2} U_j$, with $\bar\lambda = (\lambda_1 + \ldots + \lambda_m)/m$.
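The sample analogue of Proposition 3 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: each preshape is stored as an m x k array whose columns are the centered landmarks scaled to unit total norm, the expectation is replaced by a sample average, and the function names `preshape` and `extrinsic_mean_gram` are hypothetical, not taken from the cited papers.

```python
import numpy as np

def preshape(x):
    """Center an m x k landmark matrix (columns are landmarks) and scale it to unit norm."""
    centered = x - x.mean(axis=1, keepdims=True)
    return centered / np.linalg.norm(centered)

def extrinsic_mean_gram(preshapes, m):
    """Return the k x k matrix J(mu_E) computed from a sample of m x k preshapes."""
    # Sample average of the Gram matrices ((u_i . u_j)) = u^T u.
    B = np.mean([u.T @ u for u in preshapes], axis=0)
    lam, U = np.linalg.eigh(B)          # eigenvalues in ascending order
    lam, U = lam[::-1], U[:, ::-1]      # reorder so that lam[0] >= lam[1] >= ...
    if not lam[m - 1] > lam[m]:
        raise ValueError("lambda_m is not larger than lambda_{m+1}: no unique extrinsic mean")
    lam_bar = lam[:m].mean()            # mean of the m largest eigenvalues
    V = U[:, :m] * np.sqrt(lam[:m] - lam_bar + 1.0 / m)   # columns v_1, ..., v_m
    return V @ V.T

# Usage with simulated (not real) data: 20 configurations of k = 5 landmarks in R^3.
rng = np.random.default_rng(0)
sample = [preshape(rng.normal(size=(3, 5))) for _ in range(20)]
J_mean = extrinsic_mean_gram(sample, m=3)
```

The returned matrix has trace 1 and rank m, as the formula in (ii) implies, which gives a quick sanity check on any implementation.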


APPLICATION of ($R\Sigma_3^k$)

To detect any shape change of the inner eye due to glaucoma, 3D images of the optic nerve head (ONH) of both eyes of 12 mature rhesus monkeys were recorded. One of the eyes was subjected to increased intraocular pressure (IOP). $k = 5$ landmarks of the inner eye were measured on each eye. For this match pair experiment, the manifold is $R\Sigma_3^k \times R\Sigma_3^k$. The null hypothesis is that the (extrinsic) mean lies on the diagonal of this product manifold (BP (2005), BB (2009)). The p-value of the nonparametric chi-square test is (BB (2009)) $1.55 \times 10^{-5}$.


2 (d). $\mathrm{Sym}^+(p)$: $p \times p$ Positive Definite Matrices

1. Euclidean metric: $\|A\|^2 = \mathrm{Trace}(A^2)$. $\mathrm{Sym}^+(p)$ is an open convex subset of $\mathrm{Sym}(p)$, and $Q$ on $\mathrm{Sym}^+(p)$ has Euclidean mean $\mu_E = \int A \, Q(dA)$.

2. log-Euclidean metric (Arsigny et al. (2006)). $J \equiv \log : \mathrm{Sym}^+(p) \to \mathrm{Sym}(p)$ is the inverse of the exponential map $B \to e^B$, $\mathrm{Sym}(p) \to \mathrm{Sym}^+(p)$ (a diffeomorphism). $d_{LE}(A_1, A_2) = \|\log(A_1) - \log(A_2)\|$. $\mu_{LE} = \exp\left(\int \log(A) \, Q(dA)\right)$ (extrinsic mean under $J$).

Also, intrinsic mean under the bi-invariant metric of $\mathrm{Sym}^+(p)$ as a Lie group: $A_1 \circ A_2 = \exp(\log(A_1) + \log(A_2))$ (zero curvature).

3. Affine invariant metric. $d_{AI}^2(A_1, A_2) = \|\log(A_1^{-1/2} A_2 A_1^{-1/2})\|^2$.

$\langle B_1, B_2 \rangle = \mathrm{Trace}(A^{-1} B_1 A^{-1} B_2)$ (non-positive curvature).
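The three geometries above can be compared numerically. The following is a minimal NumPy sketch, assuming the input matrices are symmetric positive definite and using the eigendecomposition for the matrix log, exponential and inverse square root; all function names are illustrative, and the sample means are empirical versions of the integrals against $Q$.

```python
import numpy as np

def _sym_apply(A, fn):
    """Apply a scalar function to a symmetric matrix through its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * fn(w)) @ V.T

def spd_log(A):
    return _sym_apply(A, np.log)

def sym_exp(B):
    return _sym_apply(B, np.exp)

def euclidean_mean(mats):
    """Arithmetic mean of the matrices (sample version of mu_E)."""
    return np.mean(mats, axis=0)

def log_euclidean_mean(mats):
    """exp of the average matrix log (sample version of mu_LE)."""
    return sym_exp(np.mean([spd_log(A) for A in mats], axis=0))

def d_log_euclidean(A1, A2):
    return np.linalg.norm(spd_log(A1) - spd_log(A2))

def d_affine_invariant(A1, A2):
    A1_inv_sqrt = _sym_apply(A1, lambda w: w ** -0.5)
    return np.linalg.norm(spd_log(A1_inv_sqrt @ A2 @ A1_inv_sqrt))

# Usage: the two means differ for a matrix and its inverse.
A = np.diag([1.0, 2.0, 4.0])
pair = [A, np.linalg.inv(A)]
mu_E = euclidean_mean(pair)        # diag(1, 1.25, 2.125)
mu_LE = log_euclidean_mean(pair)   # the identity matrix, up to rounding
```

For a matrix and its inverse the log-Euclidean mean is the identity while the Euclidean mean is not, which is one simple way to see that the choice of metric changes the answer.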


2 (d). $\mathrm{Sym}^+(p)$: $p \times p$ Positive Definite Matrices

APPLICATIONS (p = 3). DTI (Diffusion Tensor Imaging) provides measurements of the diffusion matrix of water molecules in tiny voxels in the white matter of the brain. Anisotropy in the presence of the structural barriers of nerve fibers is reduced when a trauma occurs (Parkinson's, Alzheimer's, ...). Challenges to statistical inference. Also see Schwartzman (2014).


APPLICATIONS (p = 3): HIV IMAGING DATA

Diffusion-weighted images were acquired for each of 46 subjects: 28 HIV+ subjects and 18 healthy controls.

In previous DTI findings, the diffusion tensors in the splenium of the corpus callosum were found to be significantly different between the HIV+ and control groups. We examine the finite sample performance of our method by using this fiber tract.

Diffusion tensors were constructed for 75 voxels along the fiber.

In order to detect meaningful group differences, registration is crucial. The 46 HIV DTI data sets used in our studies, including the splenium tracts and the diffusion tensors on them, were registered in the same atlas space.


APPLICATIONS (p = 3): HIV IMAGING DATA

We first carry out the two-sample tests (voxel-wise) using a test statistic based on the usual Euclidean distance,

$$(\bar X - \bar Y)\, \Sigma^{-1} (\bar X - \bar Y)^T,$$

where $\bar X$ and $\bar Y$ are the sample mean vectors of dimension 6 of $X$ and $Y$ respectively, and $\Sigma = \frac{1}{n_1}\Sigma_X + \frac{1}{n_2}\Sigma_Y$.

The test statistic has an asymptotic chi-square distribution $\chi^2(6)$.

Next is a plot of the p-values along the fiber tracts.

We can apply the Benjamini-Yekutieli procedure to control the false discovery rate.
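A voxel-wise test of this form might be coded as below. This is a sketch under assumptions, not the authors' implementation: each $3 \times 3$ tensor is reduced to its 6 upper-triangular entries, $\Sigma_X$ and $\Sigma_Y$ are taken to be the usual unbiased sample covariances, and the $\chi^2(6)$ tail uses the closed form that holds for even degrees of freedom. The names `vectorize`, `chi2_sf_6` and `two_sample_test` are hypothetical.

```python
import math
import numpy as np

IDX = np.triu_indices(3)  # positions of the 6 distinct entries of a symmetric 3 x 3 matrix

def vectorize(mats):
    """Stack the 6 distinct entries of each symmetric 3 x 3 matrix into a row vector."""
    return np.array([A[IDX] for A in mats])

def chi2_sf_6(t):
    """Upper tail probability of chi-square(6): exp(-t/2) * (1 + t/2 + (t/2)^2 / 2)."""
    h = t / 2.0
    return math.exp(-h) * (1.0 + h + h * h / 2.0)

def two_sample_test(X, Y):
    """X: (n1, 6) array, Y: (n2, 6) array. Returns (statistic, asymptotic p-value)."""
    n1, n2 = len(X), len(Y)
    diff = X.mean(axis=0) - Y.mean(axis=0)
    # Sigma = Sigma_X / n1 + Sigma_Y / n2, with unbiased sample covariances (an assumption).
    sigma = np.cov(X, rowvar=False) / n1 + np.cov(Y, rowvar=False) / n2
    stat = float(diff @ np.linalg.solve(sigma, diff))
    return stat, chi2_sf_6(stat)
```

Running `two_sample_test` once per voxel on the vectorized tensors of the two groups would give one p-value per location along the fiber.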


APPLICATIONS (p = 3): HIV IMAGING DATA

Figure: p-values along the fibers using Euclidean distance (horizontal axis: arc length; vertical axis: p-values).


FALSE DISCOVERY RATE. BENJAMINI-HOCHBERG PROCEDURE

Set $\alpha = 0.05$. Apply the Benjamini-Hochberg procedure to the tests.

Reject only the $k$ null hypotheses with the smallest p-values, where $k = \max\{i : p_{(i)} \le \frac{i}{m}\alpha\}$ and $p_{(1)} \le \ldots \le p_{(m)}$ are the ordered p-values of the $m$ tests.

In our example we first order the 75 p-values corresponding to the tests carried out at all the locations.

The ordered p-values are compared with the vector $\{0.05/75, 0.1/75, \ldots, 0.05\}$, which gives the result $k = 58$.

Therefore we reject the 58 null hypotheses corresponding to the first 58 ordered p-values.

The false discovery rate is at most $\frac{m_0}{m}\alpha \le \alpha$, where $m_0$ is the number of true null hypotheses.
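The step-up rule can be written out directly. The sketch below is plain Python with a hypothetical function name, and the p-values in the usage lines are made up for illustration; they are not the 75 p-values from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (k, rejected_indices) for the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices sorted by p-value
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k = rank  # keep the largest rank i with p_(i) <= (i / m) * alpha
    return k, sorted(order[:k])

# Usage with made-up p-values: the third ordered p-value meets its bound,
# so the two smaller ones are rejected as well even though they miss theirs.
k, rejected = benjamini_hochberg([0.02, 0.03, 0.035, 0.9], alpha=0.05)
```

Note that the procedure takes the largest qualifying index, so hypotheses whose own ordered p-value exceeds its threshold can still be rejected when a later one qualifies.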


APPLICATIONS (p = 3): HIV IMAGING DATA

Second, we carry out two-sample tests based on the log-Euclidean distance of the DTI matrices. The matrix log of each raw diffusion matrix is first calculated. The test statistic is based on the Euclidean distance of the 6 distinct values of the log matrices.

Next is a plot of the p-values along the fiber tracts.

To control the false discovery rate, we also carry out the Benjamini-Hochberg procedure. We reject the first 48 tests based on the ordered p-values.


APPLICATIONS (p = 3): HIV IMAGING DATA

Figure: p-values along the fibers using log-Euclidean distance (horizontal axis: arc length; vertical axis: p-values; legend: p-value, 0.05).


Plot of the p-values

Figure: p-values along the fibers using Euclidean and log-Euclidean distance (horizontal axis: arc length; vertical axis: p-values; legend: log-Euclidean, Euclidean, 0.05).


2(e). Stratified Spaces (1) $\Sigma_m^k$ ($m > 2$)

Stratified spaces $S$ are made up of several subspaces of different dimensions.

A familiar example is $\Sigma_m^k$ with $m > 2$. After translation and scaling the k-ads lie in (and fill out) a preshape sphere $S^{mk-m-1}$. The shape space is then viewed as $\Sigma_m^k = S^{mk-m-1}/SO(m)$.

For simplicity, consider $m = 3$. One may split $\Sigma_3^k$ into two strata. The larger stratum $S_1$ corresponds to shapes of non-collinear k-ads.

$S_1$ is a manifold of dimension $3k - 4 - 3 = 3k - 7$. The manifold is not complete in the geodesic distance.

The other stratum $S_0$ comprises shapes of k-ads, each k-ad being a set of $k$ collinear points in $\mathbb{R}^3$. Each orbit has dimension 3. The stratum $S_0$ may be given the structure of a differentiable manifold of dimension $k - 2$.
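As a small illustration of the split for $m = 3$, the sketch below decides which stratum the shape of a given k-ad falls in by checking the rank of its preshape. It assumes the landmarks are stored as the columns of a 3 x k NumPy array; the function name `stratum` and the numerical tolerance are illustrative choices, not part of the source.

```python
import numpy as np

def stratum(x, tol=1e-10):
    """x: 3 x k array of landmarks (columns). Returns (stratum name, dimension of that stratum)."""
    k = x.shape[1]
    centered = x - x.mean(axis=1, keepdims=True)   # remove translation
    norm = np.linalg.norm(centered)
    if norm < tol:
        raise ValueError("all landmarks coincide; no shape is defined")
    z = centered / norm                            # preshape on the unit sphere S^{3k-4}
    rank = np.linalg.matrix_rank(z, tol=tol)
    if rank == 1:
        return "S0", k - 2       # collinear k-ad
    return "S1", 3 * k - 7       # non-collinear k-ad

# Usage: four points on a line versus four points in general position.
line = np.outer([1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 5.0])
general = np.array([[0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0]])
```

Here `stratum(line)` lands in the lower-dimensional stratum and `stratum(general)` in the larger one.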


2(e). Stratified Spaces (2) Open Book (Hotz et al. (2013))

An open book $O$ is the disjoint union of $K$ open leaves $L_j^+ = (H, j)$, $1 \le j \le K$, joined at the spine $L_0$ as the common boundary. Here $H = (0, \infty) \times \mathbb{R}^d$ and $L_0 = \{0\} \times \mathbb{R}^d$.

The distance $\rho$ on $L_j^+$, or $L_0$, is the usual Euclidean distance on $\mathbb{R}^{d+1}$, but for $j \ne k$, $\rho((x, j), (y, k)) = |x - Ry|$, where $Ry$ is the reflection, $Ry = (-y^{(0)}, y^{(1)}, \ldots, y^{(d)})$ $\forall\, y = (y^{(0)}, y^{(1)}, \ldots, y^{(d)}) \in [0, \infty) \times \mathbb{R}^d$.

The open book is a geodesic space with non-positive curvature in the sense of A.D. Alexandrov and therefore $Q$ has a unique Frechet mean.
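The distance $\rho$ is simple to compute. A minimal sketch in plain Python follows, assuming a point is represented as a pair (coordinates, leaf label) with a non-negative first coordinate; this representation and the function names are illustrative assumptions.

```python
import math

def reflect(y):
    """R y = (-y^(0), y^(1), ..., y^(d))."""
    return (-y[0],) + tuple(y[1:])

def open_book_distance(p, q):
    """p = (x, j), q = (y, k): x and y are coordinate tuples with first entry >= 0; j, k are leaf labels."""
    (x, j), (y, k) = p, q
    if j == k:
        return math.dist(x, y)          # same leaf: ordinary Euclidean distance
    return math.dist(x, reflect(y))     # different leaves: the path crosses the spine

# Usage (d = 1): two points at height 1 above the same spine point, on different leaves.
rho = open_book_distance(((1.0, 0.0), 1), ((1.0, 0.0), 2))
```

For points on the spine the first coordinate is 0, the reflection leaves them unchanged, and the leaf label no longer matters.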


2(e). Stratified Spaces (2) Open Book (Hotz et al. (2013))

Consider the map $F_j : O \to \mathbb{R}^{d+1}$, $F_j((x, j)) = x$, $F_j((x, k)) = Rx$ if $k \ne j$. Write $m_j = \int x^{(0)} \, (Q \circ F_j^{-1})(dx)$.

Under the assumption $Q(L_j^+) > 0$ for $1 \le j \le K$, either (1) $m_j \ge 0$ for some $j$, and $m_k < 0$ $\forall\, k \ne j$, or (2) $m_j < 0$ $\forall\, j$.

In case (2), the Frechet mean is sticky, that is, with probability one, $\mu_N \in L_0$ for all sufficiently large $N$. Also, the classical CLT holds on the d-dimensional space $L_0$.


2(e). Stratified Spaces (2) Open Book

Recall case (1): $m_j \ge 0$ for some $j$, and $m_k < 0$ $\forall\, k \ne j$.

If in case (1), $m_j > 0$, then $\mu$ lies in the open leaf $L_j^+$, as do $\mu_N$ for all sufficiently large $N$; hence the classical $(d + 1)$-dimensional CLT holds.

If, however, $m_j = 0$, $\mu \in L_0$; but $\mu_N \in L_j^+$ if (the empirical) $m_{j,N} > 0$ and $\mu_N \in L_0$ if $m_{j,N} \le 0$; hence the asymptotic distribution centered at $\mu$ is the distribution of $((X_+^{(0)}, X^{(1)}, \ldots, X^{(d)}), j)$ on $L_j^+ \cup L_0$, where $(X^{(0)}, X^{(1)}, \ldots, X^{(d)})$ has the Gaussian distribution stated under the preceding case $m_j > 0$, and $X_+^{(0)} = \max\{X^{(0)}, 0\}$.


2(e). Stratified Spaces (2) Open Book

Remark 5. The study of this and some toy models of phylogenetic trees has been motivated in part by the pioneering work of Susan Holmes and her collaborators.


3 (a). NONPARAMETRIC BAYES ON MANIFOLDS-DENSITY ESTIMATION
A. Bhattacharya and D. Dunson (2010, 2012), BB (2012)

Density of $Q$ with respect to a standard measure on $M$, represented as a mixture $P(d\theta)$ of a parametric family of densities $K(x; \theta)$ ($\theta \in \Theta$):

$$f(x; P) = \int_\Theta K(x; \theta) \, P(d\theta).$$

$P$, a probability measure on $\Theta$, is often given a Dirichlet process prior (Ferguson (1974)).

Sethuraman's stick-breaking representation $\sum_j w_j \delta_{Y_j}$ of the prior, with $w_1 = u_1$, $w_j = u_j (1 - u_1) \cdots (1 - u_{j-1})$ ($j > 1$). Here $u_j$ are i.i.d. $\mathrm{Beta}(1, \alpha(\Theta))$, where $\alpha$ is the base measure on $\Theta$, and $Y_j$ are i.i.d. $\sim G = \alpha/\alpha(\Theta)$. Draws from the posterior by MCMC.
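The stick-breaking construction can be simulated directly. Below is a minimal sketch in plain Python, assuming the infinite sum is truncated after a fixed number of atoms (a common approximation, not stated on the slide); `total_mass` stands for $\alpha(\Theta)$, `base_sampler` for a draw from $G$, and all names are illustrative. The standard normal base distribution in the usage lines is a placeholder for a base measure on the parameter space.

```python
import random

def stick_breaking_weights(total_mass, n_atoms, rng=random):
    """First n_atoms weights w_1, w_2, ... with u_j i.i.d. Beta(1, total_mass)."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        u = rng.betavariate(1.0, total_mass)
        weights.append(u * remaining)   # w_j = u_j * (1 - u_1) * ... * (1 - u_{j-1})
        remaining *= 1.0 - u            # length of the stick still unbroken
    return weights

def draw_random_measure(total_mass, n_atoms, base_sampler, rng=random):
    """Truncated draw of sum_j w_j * delta_{Y_j}, returned as a list of (w_j, Y_j) pairs."""
    weights = stick_breaking_weights(total_mass, n_atoms, rng)
    atoms = [base_sampler(rng) for _ in range(n_atoms)]
    return list(zip(weights, atoms))

# Usage: alpha(Theta) = 2 and a placeholder standard normal base distribution G.
rng = random.Random(0)
measure = draw_random_measure(2.0, 50, lambda r: r.gauss(0.0, 1.0), rng)
```

The weights left out by the truncation have total mass equal to the remaining stick length, which shrinks geometrically on average, so a moderate number of atoms usually captures almost all of the mass.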


3 (a). NONPARAMETRIC BAYES ONMANIFOLDS-DENSITY ESTIMATION

Example. A density on $\Sigma_2^k$ is estimated by the kernel method (KD) (Pelletier (2005)), NP Bayes (NP) and MLE. A simulation study yielded the following estimates of the mean $L^1$ distance for these methods: NP (0.44), KD (0.75), MLE (1.03).
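For reference, the $L^1$ distance in this comparison is presumably the usual one between a density estimate and the true density used in the simulation, integrated against the standard measure on the shape space. The symbols $\hat f$, $f_0$ and $\mu$ below are notation assumed here, not taken from the slides, and the reported figures would then be averages of this quantity over simulated data sets.

```latex
\| \hat f - f_0 \|_{1}
  = \int_{\Sigma_2^k} \bigl| \hat f(x) - f_0(x) \bigr| \, \mu(dx)
```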

3 (b). NONPARAMETRIC BAYES ON MANIFOLDS-CLASSIFICATIONS

Classifications. $\Sigma_2^8$ (Gorilla Skulls). $n_1 = 30$, $n_2 = 29$. From each group, 25 were randomly chosen as training samples. The remaining 9 were classified.

Figure: Estimated shape densities of gorillas: Female (solid), Male (dot). Estimate (r), 95% C.R. (b, g).

[Plot: Predictive densities: Female (−), Male (..). Horizontal axis from −0.1 to 0.15; vertical axis from 0 to 7 × 10^18.]

Densities evaluated at a dense grid of points drawn from the unit speed geodesic starting at the female extrinsic mean in the direction of the male extrinsic mean.
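The grid of evaluation points along a geodesic can be sketched as follows. This is a minimal illustration under stated assumptions, not the code behind the figure: planar shapes are represented by centered, unit-norm complex vectors (preshapes), shapes are preshapes modulo rotation, and geodesic distance is taken as arccos |z^H w|. The two random configurations stand in for the female and male extrinsic means, which are not computed here, and the names `preshape` and `shape_geodesic` are hypothetical.

```python
import numpy as np


def preshape(landmarks):
    """Center a planar k-ad (complex vector) and scale it to unit norm."""
    z = np.asarray(landmarks, dtype=complex)
    z = z - z.mean()
    return z / np.linalg.norm(z)


def shape_geodesic(z, w, t):
    """Preshapes gamma(t) on the unit-speed geodesic from shape [z] toward [w].

    Returns the points (one row per value of t) and the distance rho, so that
    gamma(0) = z and gamma(rho) is w rotated into optimal position with z.
    """
    inner = np.vdot(z, w)  # z^H w (vdot conjugates its first argument)
    rho = np.arccos(min(abs(inner), 1.0))
    if np.sin(rho) < 1e-12:
        raise ValueError("the two shapes coincide; the direction is undefined")
    w_aligned = w * np.exp(-1j * np.angle(inner))  # makes z^H w_aligned real, > 0
    v = (w_aligned - np.cos(rho) * z) / np.sin(rho)  # unit tangent, orthogonal to z
    t = np.atleast_1d(np.asarray(t, dtype=float))
    points = np.cos(t)[:, None] * z[None, :] + np.sin(t)[:, None] * v[None, :]
    return points, rho


rng = np.random.default_rng(1)
k = 8  # number of landmarks, as in the gorilla skull example
z_female = preshape(rng.normal(size=k) + 1j * rng.normal(size=k))  # stand-in mean
z_male = preshape(rng.normal(size=k) + 1j * rng.normal(size=k))    # stand-in mean
t_grid = np.linspace(-0.1, 0.15, 200)  # range read off the figure's horizontal axis
grid_points, rho = shape_geodesic(z_female, z_male, t_grid)
```

A density estimate would then be evaluated at each row of `grid_points`; negative values of `t` extend the geodesic backwards past the starting shape.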

REFERENCES

Arsigny, V., Fillard, P., Pennec, X. and Ayache, N. (2006). Magn. Reson. Med.

Bandulasiri, A., Bhattacharya, R. and Patrangenaru, V. (2009). JMVA.

Bhattacharya, A. (2008). Sankhya, A.

Bhattacharya, A. and Bhattacharya, R. (2008). Proc. Amer. Math. Soc.

Bhattacharya, A. and Bhattacharya, R. (2012). IMS Monograph Series #2.

Bhattacharya, A. and Dunson, D. (2010). Biometrika.

Bhattacharya, A. and Dunson, D. (2012). Ann. Inst. Statist. Math.

Bhattacharya, R. and Lin, L. (2016). Proc. Amer. Math. Soc.

Bhattacharya, R. and Patrangenaru, V. (2005). Ann. Statist.

Bhattacharya, R. and Patrangenaru, V. (2003). Ann. Statist.

Bhattacharya, R. and Lin, L. (2013). http://arxiv.org/abs/1306.5806

Bookstein, F. (1991). Cambridge University Press.

Dryden, I. L. and Mardia, K. V. (1998). Wiley, New York.

Dryden, I. L., Kume, A., Le, H. and Wood, A. T. A. (2008). Biometrika.

Fisher, R. A. (1953). Proc. Roy. Soc. London.

Ferguson, T. (1973, 1974). Ann. Statist.

Hotz, T., Huckemann, S., Le, H., Marron, J. S., Mattingly, J. C., Miller, E., Nolen, J., Owen, M., Patrangenaru, V. and Skwerer, S. (2013). Ann. Appl. Probab.

Irving, E. (1963). Wiley.

Hendriks, H. and Landsman, Z. (1996). C. R. Acad. Sci.

Hendriks, H. and Landsman, Z. (1998). JMVA.

Karcher, H. (1977). Comm. Pure Appl. Math.

Kendall, D. G. (1984). Bull. London Math. Soc.

Kendall, W. S. and Le, H. (2011). Brazilian J. of Probab. and Statist.

Pelletier, B. (2005). Statist. and Probab. Letters.

Schwarzman, A. (2014). To appear.

Sethuraman, J. (1994). Statist. Sinica.

Ziezold, H. (1977). Transactions of the Seventh Prague Conference on Information Theory, Statistical Decision Functions, Random Processes and of the Eighth European Meeting of Statisticians.
