70
Object Orie’d Data Analysis, Last Time • Discrimination for manifold data (Sen) – Simple Tangent plane SVM – Iterated TANgent plane SVM – Manifold SVM • Interesting point: Analysis done really in the manifold – Not just in projected tangent plane Deeper than Principal Geodesic Analysis? • Manifold version of DWD?

Object Orie’d Data Analysis, Last Time

  • Upload
    eytan

  • View
    16

  • Download
    0

Embed Size (px)

DESCRIPTION

Object Orie’d Data Analysis, Last Time. Discrimination for manifold data (Sen) Simple Tangent plane SVM Iterated TANgent plane SVM Manifold SVM Interesting point: Analysis done really in the manifold Not just in projected tangent plane Deeper than Principal Geodesic Analysis? - PowerPoint PPT Presentation

Citation preview

Page 1: Object Orie’d Data Analysis, Last Time

Object Orie’d Data Analysis, Last Time

• Discrimination for manifold data (Sen)– Simple Tangent plane SVM

– Iterated TANgent plane SVM

– Manifold SVM

• Interesting point:

Analysis done really in the manifold– Not just in projected tangent plane

– Deeper than Principal Geodesic Analysis?

• Manifold version of DWD?

Page 2: Object Orie’d Data Analysis, Last Time

Mildly Non-Euclidean Spaces

Useful View of Manifold Data: Tangent Space

Center:Frechét Mean

Reason forterminology“mildly nonEuclidean”

Page 3: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

Trees as Data Objects

From Graph Theory:

• Graph is set of nodes and edges• Tree has root and direction

Data Objects: set of trees

Page 4: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

Motivating Example:

• Blood Vessel Trees in Brains

• From Dr. Elizabeth Bullitt– Dept. of Neurosurgery, UNC

• Segmented from MRAs

• Study population of trees

(Forest?)

Page 5: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

MRI view

Single Slice

From 3-d Image

Page 6: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

MRA view

“A” for Angiography”

Finds blood vessels

(show up as white)

Track through 3d

Page 7: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

MRA view

“A” for Angiography”

Finds blood vessels

(show up as white)

Track through 3d

Page 8: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

MRA view

“A” for Angiography”

Finds blood vessels

(show up as white)

Track through 3d

Page 9: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

MRA view

“A” for Angiography”

Finds blood vessels

(show up as white)

Track through 3d

Page 10: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

MRA view

“A” for Angiography”

Finds blood vessels

(show up as white)

Track through 3d

Page 11: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

MRA view

“A” for Angiography”

Finds blood vessels

(show up as white)

Track through 3d

Page 12: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

From MRA

Segment tree

of vessel segments

Using tube tracking

Bullitt and Aylward (2002)

Page 13: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

From MRA

Reconstruct trees

in 3d

Rotate to view

Page 14: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

From MRA

Reconstruct trees

in 3d

Rotate to view

Page 15: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

From MRA

Reconstruct trees

in 3d

Rotate to view

Page 16: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

From MRA

Reconstruct trees

in 3d

Rotate to view

Page 17: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

From MRA

Reconstruct trees

in 3d

Rotate to view

Page 18: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Marron’s brain:

From MRA

Reconstruct trees

in 3d

Rotate to view

Page 19: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Now look over many people (data objects)

Structure of population (understand variation?)

PCA in strongly non-Euclidean Space???

, ... ,,

Page 20: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

The tree team:

Very Interdsciplinary

Neurosurgery: Bullitt, Ladha

Statistics: Wang, Marron

Optimization: Aydin, Pataki

Page 21: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Possible focus of analysis:

• Connectivity structure only (topology)

• Location, size, orientation of segments

• Structure within each vessel segment

, ... ,,

Page 22: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Present Focus:

Topology only

Already challenging

Later address others

Then add attributes

To tree nodes

And extend analysis

Page 23: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Recall from above:

Marron’s brain:

Focus on back

Connectivity only

Page 24: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Present Focus:

Topology only

Raw data as trees

Marron’s

reduced tree

Back tree only

Page 25: Object Orie’d Data Analysis, Last Time

Blood vessel tree data

Topology only

E.g. Back Trees

Full Population

Study as movie

Understand variation?

Page 26: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

Statistics on Population of Tree-Structured Data Objects?

• Mean???• Analog of PCA???

Strongly non-Euclidean, since:• Space of trees not a linear space• Not even approximately linear

(no tangent plane)

Page 27: Object Orie’d Data Analysis, Last Time

Mildly Non-Euclidean Spaces

Useful View of Manifold Data: Tangent Space

Center:Frechét Mean

Reason forterminology“mildly nonEuclidean”

Page 28: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

Mean of Population of Tree-Structured Data Objects?

Natural approach: Fréchet mean

Requires a metric (distance)

On tree space

n

ii

xxXdX

1

2,minarg

Page 29: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

Appropriate metrics on tree space:

Wang and Marron (2007)

• Depends on:– Tree structure

– And nodal attributes

• Won’t go further here

• But gives appropriate Fréchet mean

Page 30: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

Appropriate metrics on tree space:

Wang and Marron (2007)

• For topology only (studied here):– Use Hamming Distance

– Just number of nodes not in common

• Gives appropriate Fréchet mean

Page 31: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

PCA on Tree Space?• Recall Conventional PCA:Directions that explain structure in data

• Data are points in point cloud• 1-d and 2-d projections allow insights

about population structure

Page 32: Object Orie’d Data Analysis, Last Time

Illust’n of PCA View: PC1 Projections

Page 33: Object Orie’d Data Analysis, Last Time

Illust’n of PCA View: Projections on PC1,2 plane

Page 34: Object Orie’d Data Analysis, Last Time

Source Batch Adj: PC 1-3 & DWD direction

Page 35: Object Orie’d Data Analysis, Last Time

Source Batch Adj: DWD Source Adjustment

Page 36: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

PCA on Tree Space?

Key Ideas:

• Replace 1-d subspace

that best approximates data

• By 1-d representation

that best approximates data

Wang and Marron (2007) define notion of

Treeline (in structure space)

Page 37: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

PCA on Tree Space: Treeline

• Best 1-d representation of data

Basic idea:

• From some starting tree

• Grow only in 1 “direction”

Page 38: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

PCA on Tree Space: Treeline

• Best 1-d representation of data

Problem: Hard to compute

• In particular: to solve optimization problem

Wang and Marron (2007)

• Maximum 4 vessel trees

• Hard to tackle serious trees

(e.g. blood vessel trees)

Page 39: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

PCA on Tree Space: Treeline

Problem: Hard to compute

Solution: Burḉu Aydin & Gabor Pataki

(linear time algorithm)

(based on clever

“reformulation” of problem)

Description coming in Participant Presentation

Page 40: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

PCA on Tree Space: Treelines Interesting to compare:• Population of Left Trees• Population of Right Trees• Population of Back TreesAnd to study 1st, 2nd, 3rd & 4th treelines

Page 41: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Study

“Directions”

1, 2, 3, 4

For sub-

populations

B, L, R

(interpret

later)

Page 42: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

PCA on Tree Space: Treeline

Next represent data as projections

• Define as closest point in tree line

(same as Euclidean PCA)

• Have corresponding score

(length of projection along line)

• And analog of residual

(distance from data point to projection)

Page 43: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree dataRaw Data & Treelines, PC1, PC2, PC3:

Page 44: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree dataRaw Data & Treelines, PC1, PC2, PC3:

Projections, Scores, Residuals

Page 45: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree dataRaw Data & Treelines, PC1, PC2, PC3:

Cumulative Scores, Residuals

Page 46: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Now look

deeper at

“Directions”

1, 2, 3, 4

For sub-

populations

B, L, R

Page 47: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Notes on Treeline Directions:• PC1 always to left• BACK has most variation to right

(PC2)• LEFT has more varia’n to 2nd level

(PC2)• RIGHT has more var’n to 1st level

(PC2)See these in the data?

Page 48: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Notes:PC1 - leftBACK - right LEFT 2nd levRIGHT 1st levSee these??

Page 49: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Individual (each PC separately) Scores Plot

Page 50: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Identify this person

Page 51: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Identify this person PC Scores: 8,9,3,5

Page 52: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Identify this person (PC Scores: 8,9,3,5):

Red = older

0 = Female

Note:

color ~ age

Page 53: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Identify this person (PC Scores: 8,9,3,5):

Red = older

0 = Female

Note:

color ~ age

Page 54: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Identify this person (PC scores 1,10,1,1)

Page 55: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Identify this person (PC scores 1,10,1,1)

Page 56: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Explain strange (low score) correlation?

Page 57: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Explain strange (low score) correlation?

Revisit Treelines:

Small PC1 Score Small PC3 Score

Small PC1 Score Small PC4 Score

Page 58: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Individual (each PC sep’ly) Residuals Plot

Page 59: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Individual (each PC sep’ly) Residuals Plot

• Very strongly correlated

• Shows much variation not explained by PCs

(data are very rich)

• Note age coloring useful

Younger (bluer) more variation

Older (redder) less variation

Page 60: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Important Data Analytic Goals:

• Understand impact of age (colors)

• Understand impact of gender (symbols)

• Understand handedness (too few)

• Understand ethnicity (too few)

See these in PCA?

Page 61: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Data Analytic Goals: Age, Gender

See

these?

No…

Page 62: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Alternate View: Cumulative Scores

Page 63: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Alternate View: Cumulative Scores

• Always below 45 degree line

• Better separation of age or gender?

(doesn’t seem like it)

• This makes it easy to find:

• Best represented case

• Worst represented case

Page 64: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Cum. Scores: Best repr’ed case (10,19,20)

Page 65: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Cum. Scores: Best repr’ed case (10,19,20)

PC1 & PC2

Scores

Very Large

Page 66: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Cum. Scores: Worst repr’ed case (3,5,6)

Page 67: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Cum. Scores: Worst repr’ed case (3,5,6)

Fairly small

tree

Growth in

unusual

directions

Page 68: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Directly study age PC scores

Page 69: Object Orie’d Data Analysis, Last Time

PCA for blood vessel tree data

Directly study age PC scores

• Graphic highlights potential connections

• But no strong correlations

• PC3 is strongest of weak lot

(less young & old for large PC3 score)

Page 70: Object Orie’d Data Analysis, Last Time

Strongly Non-Euclidean Spaces

Overall Impression:

Interesting new OODA Area

Much to be to done:

• Refined PCA

• Alternate tree lines

• Attributes (i.e. go beyond topology)

• Classification / Discrimination (SVM, DWD)

• Other data types (e.g. lung airways…)