
Maximal Data Piling


Maximal Data Piling
Visual similarity of v_FLD and v_MDP?
Can show (Ahn & Marron 2009), for d < n, that v_MDP = c · v_FLD for a scalar c,
i.e. the directions are the same!
How can this be? Note that the lengths are different.
Study from the transformation viewpoint.
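A minimal numerical sketch of this relation (not from the slides; the toy data and the numpy formulation are my own choices). It uses the usual formulas v_FLD ∝ Sw⁻¹(x̄₁ − x̄₂) with the pooled within-class covariance Sw, and v_MDP ∝ St⁻¹(x̄₁ − x̄₂) with the total covariance St of the data ignoring class labels, and checks that for d < n the two directions agree up to length:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n1, n2 = 5, 30, 30                  # low dimension: d < n = n1 + n2
X1 = rng.normal(size=(n1, d)) + 1.0    # class 1, shifted mean
X2 = rng.normal(size=(n2, d))          # class 2

diff = X1.mean(axis=0) - X2.mean(axis=0)

# pooled within-class covariance (used by FLD)
Sw = ((X1 - X1.mean(axis=0)).T @ (X1 - X1.mean(axis=0))
      + (X2 - X2.mean(axis=0)).T @ (X2 - X2.mean(axis=0))) / (n1 + n2 - 2)

# total covariance of the pooled data, ignoring class labels (used by MDP)
St = np.cov(np.vstack([X1, X2]), rowvar=False)

v_fld = np.linalg.solve(Sw, diff)      # FLD direction
v_mdp = np.linalg.solve(St, diff)      # MDP direction (d < n, so St is invertible)

# same direction, different lengths
cosine = v_fld @ v_mdp / (np.linalg.norm(v_fld) * np.linalg.norm(v_mdp))
print(f"cos(angle) = {cosine:.6f}")                                   # ~ 1.0
print(f"length ratio = {np.linalg.norm(v_mdp) / np.linalg.norm(v_fld):.3f}")
```

The agreement comes from the total covariance being a rank-one update of the within-class covariance in the direction of the mean difference, so inverting either matrix maps that difference to proportional vectors (Sherman–Morrison).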

Maximal Data Piling
Recall the transformation view of FLD:

Maximal Data Piling
Include the corresponding MDP transformation:
Both give the same result!

Maximal Data Piling
Details:
FLD: separating plane normal vector
Within class: PC1, PC2
Global: PC1, PC2

Maximal Data Piling
Acknowledgement: this viewpoint, i.e. the insight into why FLD = MDP (for low dimensional data), was suggested by Daniel Peña.

Maximal Data Piling
Fun example: rotate from the PCA to the MDP directions.

Maximal Data Piling
MDP for other class labellings:
Always exists
Separation bigger for natural clusters
Could be used for clustering:
  Consider all directions
  Find the one that makes the largest gap
Very hard optimization problem: over 2^n − 2 possible labellings/directions (a brute-force sketch for tiny n follows after the next slide)

Maximal Data Piling
A point of terminology (Ahn & Marron 2009):
MDP is maximal in 2 senses:
1. # of data piled
2. Size of the gap (within the subspace generated by the data)
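A brute-force sketch of the labelling search for a tiny data set (illustrative only: the two toy clusters, the FLD formula for the direction, and the use of the projected class separation as the "gap" are my own choices, not the slides' method):

```python
import numpy as np
from itertools import product

# tiny toy data with two natural clusters; d = 2 < n, so the MDP direction
# for each labelling can be computed with the FLD formula (previous slides)
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-2.0, 0.5, size=(5, 2)),
               rng.normal(+2.0, 0.5, size=(5, 2))])
n = len(X)

def scatter(A):
    C = A - A.mean(axis=0)
    return C.T @ C

def gap(labels):
    """Separation of the two classes projected on the FLD/MDP direction."""
    A, B = X[labels], X[~labels]
    Sw = scatter(A) + scatter(B) + 1e-8 * np.eye(2)   # small ridge for tiny classes
    v = np.linalg.solve(Sw, A.mean(axis=0) - B.mean(axis=0))
    v /= np.linalg.norm(v)
    pa, pb = A @ v, B @ v
    return max(pa.min() - pb.max(), pb.min() - pa.max())

# brute force over all 2^n - 2 non-trivial labellings
candidates = (np.array(lab, dtype=bool) for lab in product([0, 1], repeat=n)
              if 0 < sum(lab) < n)
best = max(candidates, key=gap)
print(2**n - 2, "labellings searched")             # 1022 already for n = 10
print("largest-gap labelling:", best.astype(int))  # typically the two natural clusters
```

The number of labellings doubles with every added observation, which is what makes the full search infeasible beyond toy sizes.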

Maximal Data Piling
Recurring, over-arching issue:
HDLSS space is a weird place

Maximal Data Piling
Usefulness for classification?
First glance: terrible, not generalizable
HDLSS view: maybe OK?
Simulation result: good performance for autocorrelated errors
Reason: not known

Kernel Embedding
Aizerman, Braverman and Rozonoer (1964)
Motivating idea: extend the scope of linear discrimination by adding nonlinear components to the data (embedding in a higher dimensional space)

Better use of the name: nonlinear discrimination?

Kernel Embedding
Toy examples:
In 1d, linear separation splits the domain into only 2 parts.


Kernel Embedding
But in the quadratic embedded domain {(x, x²) : x ∈ R} ⊂ R²,
linear separation can give 3 parts.

Kernel Embedding
But in the quadratic embedded domain:
Linear separation can give 3 parts (see the sketch below)
The original data space lies in a 1d manifold, very sparse in R²
The curvature of the manifold gives better linear separation
Can have any 2 break points (2 points determine a line)
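A small sketch of why the count is 3 (illustrative; the weights w1, w2 and threshold c are hypothetical): a linear rule in the embedded coordinates (x, x²) is a quadratic inequality in the original x, so its boundary has at most 2 roots and splits the line into at most 3 parts.

```python
import numpy as np

# A linear rule in the embedded coordinates (x, x^2): classify as class 1 when
#   w1*x + w2*x^2 > c.
# The weights w1, w2 and threshold c below are hypothetical, chosen for illustration.
w1, w2, c = 0.0, 1.0, 1.0

x = np.linspace(-3, 3, 601)                       # dense grid on the original 1d domain
labels = np.where(w1 * x + w2 * x**2 > c, 1, -1)  # apply the embedded linear rule

# count contiguous runs of equal labels: at most 3 for any quadratic rule
n_parts = 1 + np.count_nonzero(np.diff(labels))
print(n_parts)   # 3 here: x < -1, -1 < x < 1, and x > 1
```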

Kernel Embedding
Stronger effects for higher order polynomial embedding:
E.g. for the cubic embedded domain {(x, x², x³) : x ∈ R} ⊂ R³,
linear separation can give 4 parts (or fewer).

Kernel Embedding
Stronger effects for higher order polynomial embedding:
The original space lies in a 1-d manifold, even sparser in the higher dimensional space
Curvature gives improved linear separation
Can have any 3 break points (3 points determine a plane)
Note: relatively few interesting separating planes

Kernel Embedding
General view: for the original data matrix,
add rows (nonlinear functions of the original entries, e.g. polynomial terms),
i.e. embed in a higher dimensional space.
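One concrete way to build such an embedded data matrix is with polynomial features; a sketch using scikit-learn's PolynomialFeatures (an implementation choice, not something named in the slides). Note that scikit-learn stores observations in rows, so the slide's "added rows" of new variables appear here as added columns:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
n, d = 6, 2
X = rng.normal(size=(n, d))          # n observations of d original variables

# degree-2 polynomial embedding: x1, x2, x1^2, x1*x2, x2^2
embed = PolynomialFeatures(degree=2, include_bias=False)
X_embedded = embed.fit_transform(X)

print(X.shape)            # (6, 2)
print(X_embedded.shape)   # (6, 5): the extra variables of the embedding
print(embed.get_feature_names_out(["x1", "x2"]))
```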

Kernel Embedding
Embedded Fisher Linear Discrimination:
Choose Class 1, for any point x in the original space, when the FLD rule assigns Class 1 to its image in the embedded space.
The image of the class boundaries in the original space is nonlinear
Allows more complicated class regions
Can also use the Gaussian Likelihood Ratio (or other rules)
Compute the image by classifying points from the original space

Kernel Embedding
Visualization for the toy examples:
Have a linear discrimination rule in the embedded space
Study its effect in the original data space, via the implied nonlinear regions
Approach (sketched below):
Use a test set in the original space (a dense, equally spaced grid)
Apply the embedded discrimination rule
Color using the result
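A hedged sketch of this recipe (the "parallel clouds"-style toy data, the grid ranges, the embedding degree, and the use of scikit-learn's LDA are all my own illustrative choices, not the slides' actual code):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# toy two-class data, roughly "parallel clouds" style (illustrative only)
rng = np.random.default_rng(2)
n = 100
X = np.vstack([rng.normal([-1.0, 0.0], 0.5, size=(n, 2)),
               rng.normal([+1.0, 0.0], 0.5, size=(n, 2))])
y = np.repeat([0, 1], n)

# embedded Fisher linear discrimination: FLD fit in a polynomial feature space
embed = PolynomialFeatures(degree=3, include_bias=False)
fld = LinearDiscriminantAnalysis().fit(embed.fit_transform(X), y)

# dense, equally spaced grid in the ORIGINAL data space
xx, yy = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
grid = np.column_stack([xx.ravel(), yy.ravel()])

# apply the embedded rule to every grid point and color by the result
zz = fld.predict(embed.transform(grid)).reshape(xx.shape)
plt.contourf(xx, yy, zz, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y, s=10)
plt.title("Implied nonlinear class regions in the original space")
plt.show()
```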

Kernel Embedding
Polynomial Embedding, Toy Example 1: Parallel Clouds
[Figure slides, PEod1Raw.ps: PC1, FLD and GLR views of the original data and its polynomial embeddings]
PC1: note that the embedding is always bad here; we don't want the direction of greatest variation
FLD: all stable & very good
GLR: unstable, subject to overfitting