Upload
others
View
6
Download
0
Embed Size (px)
Citation preview
3D Geometry forComputer Graphics
Lesson 2: PCA & SVD
2
Last week - eigendecomposition
A
We want to learn how the matrix A works:
3
Last week - eigendecomposition
A
If we look at arbitrary vectors, it doesn’t tell us much.
4
Spectra and diagonalization
A
If A is symmetric, the eigenvectors are orthogonal (and there’s always an eigenbasis).
A = UΛUT
⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜
⎝
⎛
nλ
λλ
2
1
Λ= Aui = λi ui
5
Why SVD…
Diagonalizable matrix is essentially a scaling.Most matrices are not diagonalizable – they do other things along with scaling (such as rotation)
So, to understand how general matrices behave, only eigenvalues are not enoughSVD tells us how general linear transformations behave, and other things…
6
The plan for today
First we’ll see some applications of PCA –Principal Component Analysis that uses spectral decomposition.Then look at the theory.SVD
Basic intuitionFormal definitionApplications
7
PCA finds an orthogonal basis that best represents given data set.
The sum of distances2 from the x’ axis is minimized.
PCA – the general idea
x
y
x’
y’
8
PCA – the general idea
PCA finds an orthogonal basis that best represents given data set.
PCA finds a best approximating plane (again, in terms of Σdistances2)
3D point set instandard basis
x y
z
9
Application: finding tight bounding box
An axis-aligned bounding box: agrees with the axes
xminX maxX
y
maxY
minY
10
Application: finding tight bounding box
Oriented bounding box: we find better axes!
x’y’
11
Application: finding tight bounding box
This is not the optimal bounding box
x y
z
12
Application: finding tight bounding box
Oriented bounding box: we find better axes!
13
Usage of bounding boxes (bounding volumes)
Serve as very simple “approximation” of the objectFast collision detection, visibility queriesWhenever we need to know the dimensions (size) of the object
The models consist of thousands of polygonsTo quickly test that they don’t intersect, the bounding boxes are testedSometimes a hierarchy of BB’s is used
The tighter the BB – the less “false alarms” we have
Sample
14
Scanned meshes
15
Point clouds
Scanners give us raw point cloud dataHow to compute normals to shade the surface?
normal
16
Point clouds
Local PCA, take the third vector
17
Notations
Denote our data points by x1, x2, …, xn ∈ Rd
11 11 2
22 21 2
1 2
, , ,
n
n
dd dn
xx xxx x
xx x
⎛ ⎞⎛ ⎞ ⎛ ⎞⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟⎜ ⎟ ⎜ ⎟= = = ⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
1 2 nx x x…
18
The origin is zero-order approximation of our data set (a point)It will be the center of mass:
It can be shown that:
The origin of the new axes
1
1
n
ini=
= ∑m x
2
1
argminn
ii=
= ∑x
m x - x
19
Scatter matrix
Denote yi = xi – m, i = 1, 2, …, n
where Y is d×n matrix with yk as columns (k = 1, 2, …, n)
TS YY=
1 1 1 1 21 2 1 1 12 2 2 1 21 2 2 2 2
1 21 2
dn
dn
d d d dn n n n
y y y y y yy y y y y y
S
y y y y y y
⎛ ⎞ ⎛ ⎞⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟= ⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠
Y YT
20
Variance of projected points
In a way, S measures variance (= scatterness) of the data in different directions.Let’s look at a line L through the center of mass m, and project our points xi onto it. The variance of the projected points x’i is:
Original set Small variance Large variance
21
1
var( ) || ||n
ini
L=
′= −∑ x m
L L L L
21
Variance of projected points
Given a direction v, ||v|| = 1, the projection of xionto L = m + vt is:
|| || , / || || , Ti i i i′ − = < − > = < > =x m v x m v v y v y
vm
xi
x’iL
22
Variance of projected points
So,
2 2 21 1 1
1 1
21 1 1 1 1
var( ) || || ( ) || ||
|| || , ,
n n
i in n ni i
T T Tn n n n n
L Y
Y Y Y YY S S= =
′= = = =
= = < > = = = < >
∑ ∑ T T
T T T
x -m v y v
v v v v v v v v v
( ) ( )
2 21 1 1 11 2
2 2 2 22 1 2 1 2 21 2
1 1
1 2
( ) || ||
i nn n
T d di n
i id d d di n
y y y yy y y y
v v v v v v Y
y y y y= =
⎛ ⎞⎛ ⎞ ⎛ ⎞⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟⎜ ⎟ ⎜ ⎟= = =⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟⎝ ⎠ ⎝ ⎠⎝ ⎠
∑ ∑ Tiv y v
23
Directions of maximal variance
So, we have: var(L) = <Sv, v>Theorem: Let f : {v ∈ Rd | ||v|| = 1} → R,
f (v) = <Sv, v> (and S is a symmetric matrix).
Then, the extrema of f are attained at the eigenvectors of S.
So, eigenvectors of S are directions of maximal/minimal variance!
24
Summary so far
We take the centered data vectors y1, y2, …, yn ∈ Rd
Construct the scatter matrixS measures the variance of the data pointsEigenvectors of S are directions of maximal variance.
TS YY=
25
Scatter matrix - eigendecomposition
S is symmetric⇒ S has eigendecomposition: S = VΛVT
S = v2v1 vd
λ1λ2
λd
v2
v1
vd
The eigenvectors formorthogonal basis
26
Principal components
Eigenvectors that correspond to big eigenvalues are the directions in which the data has strong components (= large variance).If the eigenvalues are more or less the same –there is no preferable direction.
Note: the eigenvalues are always non-negative.
27
Principal components
There’s no preferable directionS looks like this:
Any vector is an eigenvector
TV Vλ
λ⎛ ⎞⎜ ⎟⎝ ⎠
There is a clear preferable directionS looks like this:
μ is close to zero, much smaller than λ.
TVV ⎟⎟⎠
⎞⎜⎜⎝
⎛μ
λ
28
How to use what we got
For finding oriented bounding box – we simply compute the bounding box with respect to the axes defined by the eigenvectors. The origin is at the mean point m.
v2v1
v3
29
For approximation
x
yv1
v2
x
y
This line segment approximates the original data set
The projected data set approximates the original data set
x
y
30
For approximation
In general dimension d, the eigenvalues are sorted in descending order:
λ1 ≥ λ2 ≥ … ≥ λd
The eigenvectors are sorted accordingly.To get an approximation of dimension d’ < d, we take the d’ first eigenvectors and look at the subspace they span (d’ = 1 is a line, d’ = 2 is a plane…)
31
For approximation
To get an approximating set, we project the original data points onto the chosen subspace:xi = m + α1v1 + α2v2 +…+ αd’vd’ +…+αdvd
Projection:
xi’ = m + α1v1 + α2v2 +…+ αd’vd’ +0⋅vd’+1+…+ 0⋅ vd
SVD
33
Geometric analysis of linear transformations
We want to know what a linear transformation A does Need some simple and “comprehendible” representation of the matrix of A. Let’s look what A does to some vectors
Since A(αv) = αA(v), it’s enough to look at vectors v of unit length
A
34
The geometry of linear transformations
A linear (non-singular) transform A always takes hyper-spheres to hyper-ellipses.
A
A
35
The geometry of linear transformations
Thus, one good way to understand what A does is to find which vectors are mapped to the “main axes” of the ellipsoid.
A
A
36
Geometric analysis of linear transformations
If we are lucky: A = V Λ VT, V orthogonal (true if A is symmetric)The eigenvectors of A are the axes of the ellipse
A
37
Symmetric matrix: eigen decomposition
In this case A is just a scaling matrix. The eigendecomposition of A tells us which orthogonal axes it scales, and by how much:
[ ] [ ]1
21 2 1 2
Tn n
n
A
λλ
λ
⎡ ⎤⎢ ⎥⎢ ⎥=⎢ ⎥⎢ ⎥⎢ ⎥⎣ ⎦
v v v v v v… …
A
11
λ2
λ1
iiiA vv λ=
38
General linear transformations: SVD
In general A will also contain rotations, not just scales:
A
[ ] [ ]1
21 2 1 2
Tn n
n
A
σσ
σ
⎡ ⎤⎢ ⎥⎢ ⎥=⎢ ⎥⎢ ⎥⎢ ⎥⎣ ⎦
u u u v v v… …
1 1 σ2σ1
TA U V= ∑
39
General linear transformations: SVD
A
[ ] [ ]1
21 2 1 2n n
n
A
σσ
σ
⎡ ⎤⎢ ⎥⎢ ⎥=⎢ ⎥⎢ ⎥⎢ ⎥⎣ ⎦
v v v u u u… …
1 1 σ2σ1
AV U= ∑
, 0i iA σ σ= ≥i iv u
orthonormal orthonormal
40
SVD more formally
SVD exists for any matrixFormal definition:
For square matrices A ∈ Rn×n, there exist orthogonal matrices U, V ∈ Rn×n and a diagonal matrix Σ, such that all the diagonal values σi of Σ are non-negative and
TA U V= Σ
=
A U Σ TV
41
SVD more formally
The diagonal values of Σ (σ1, …, σn) are called the singular values. It is accustomed to sort them: σ1 ≥ σ2≥ … ≥ σn
The columns of U (u1, …, un) are called the left singular vectors. They are the axes of the ellipsoid.The columns of V (v1, …, vn) are called the right singular vectors. They are the preimages of the axes of the ellipsoid.
TA U V= Σ
=
A U Σ TV
42
Reduced SVD
For rectangular matrices, we have two forms of SVD. The reduced SVD looks like this:
The columns of U are orthonormalCheaper form for computation and storage
A U Σ TV
=
43
Full SVD
We can complete U to a full orthogonal matrix and pad Σ by zeros accordingly
A U Σ TV
=
44
Some history
SVD was discovered by the following people:
E. Beltrami(1835 − 1900)
M. Jordan(1838 − 1922)
J. Sylvester(1814 − 1897)
H. Weyl(1885-1955)
E. Schmidt(1876-1959)
45
SVD is the “working horse” of linear algebra
There are numerical algorithms to compute SVD. Once you have it, you have many things:
Matrix inverse → can solve square linear systemsNumerical rank of a matrixCan solve least-squares systemsPCAMany more…
46
Matrix inverse and solving linear systems
Matrix inverse:
So, to solve
( ) ( )1
1 11 1 1
1
1n
T
T T
T
A U V
A U V V U
V Uσ
σ
− −− − −
= ∑
= ∑ = ∑ =
⎡ ⎤⎢ ⎥
= ⎢ ⎥⎢ ⎥⎣ ⎦
1 T
AV U−
=
= ∑
x bx b
47
Matrix rank
The rank of A is the number of non-zero singular values
A U Σ TV
=m
n
σ1σ2
σn
48
Numerical rank
If there are very small singular values, then A is close to being singular. We can set a threshold t, so that numeric_rank(A) = #{σi| σi > t}
If rank(A) < n then A is singular. It maps the entire space Rn onto some subspace, like a plane (so Ais some sort of projection).
49
Back to PCA
We wanted to find principal components
x
yx’
y’
50
PCA
Move the center of mass to the originpi’ = pi − m
x
y
x’y’
51
PCA
Constructed the matrix X of the data points.
The principal axes are eigenvectors of S = XXT
1 2
| | |
| | |nX
⎡ ⎤⎢ ⎥′ ′ ′= ⎢ ⎥⎢ ⎥⎣ ⎦
p p p
1T T
d
XX S U Uλ
λ
⎡ ⎤⎢ ⎥= = ⎢ ⎥⎢ ⎥⎣ ⎦
52
PCA
We can compute the principal components by SVD of X:
Thus, the left singular vectors of X are the principal components! We sort them by the size of the singular values of X.
2
( )T
T
T T T T
T TT
X U VXX U V U V
VU U UV U
= Σ
= Σ Σ =
= =Σ Σ Σ
53
Shape matching
We have two objects in correspondenceWant to find the rigid transformation that aligns them
54
Shape matching
When the objects are aligned, the lengths of the connecting lines are small.
55
Shape matching – formalization
Align two point sets
Find a translation vector t and rotation matrix Rso that:
{ } { }1 1, , and , , .n nP Q= =p p q q… …
2
1( ) is minimized
n
i ii
R=
− +∑ p tq
56
Shape matching – solution
Turns out we can solve the translation and rotation separately.Theorem: if (R, t) is the optimal transformation, then the points {pi} and {Rqi + t} have the same centers of mass.
1
1 n
iin =
= ∑p p1
1 n
iin =
= ∑q q
( )1
1
1
1
n
ii
n
ii
R
R
n
R Rn
=
=
= +
⇓
⎛ ⎞= + = +⎜ ⎟⎝ ⎠
= −
∑
∑
p q
p q q t
t p
t
t
q
57
Finding the rotation R
To find the optimal R, we bring the centroids of both point sets to the origin:
We want to find R that minimizes
i i i i′ ′= − = −p p p q q q
2
1
n
i ii
R=
′ ′−∑ p q
58
Finding the rotation R
( ) ( )
( )
2
1 1
1
n nT
i i i i i ii i
nT T T T TT
i i i i i i i ii
R R R
R R R R
= =
=
′ ′ ′ ′ ′ ′− = − − =
′ ′ ′ ′ ′ ′ ′ ′= − − +
∑ ∑
∑
p q p q p q
p p p q q p q qI
These terms do not depend on R,so we can ignore them in the minimization
59
Finding the rotation R
( ) ( )
( )
( ) ( )
1 1
1 1
min max .
max 2 max
n nT T T T
i i i ii i i i iii i
TT T Ti ii i i i i
n nT
T T
T T
Ti i i i
i i
R R R R
R R R
R R
= =
= =
′ ′ ′ ′ ′ ′ ′ ′− − = +
′ ′ ′ ′ ′ ′= =
′ ′ ′ ′⇒ =
∑ ∑
∑ ∑
p q q p p q q p
q p q p p q
p q p q
this is a scalar
60
Finding the rotation R
( )1 1 1
1
n n nT T T
i i i i i ii i i
nT
i ii
Trace Trace
H
R R R= = =
=
⎛ ⎞ ⎛ ⎞′ ′ ′ ′ ′ ′= =⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠
′ ′=
∑ ∑ ∑
∑
p q q p q p
q p
3 ×1 1 ×3 = 3 ×3
=
1( )
n
iii
Trace A A=
= ∑
61
So, we want to find R that maximizes
Theorem: if M is symmetric positive definite (all eigenvalues of M are positive) and B is any orthogonal matrix then
So, let’s find R so that RH is symmetric positive definite. Then we know for sure that Trace(RH) is maximal.
Finding the rotation R
( )Trace RH
( ) ( )Trace M Trace BM≥
62
This is easy! Compute SVD of H:
Define R:Check RH:
TH U V
Finding the rotation R
= ΣTR VU=
( )( )T T TVU U VR V VH = Σ = Σ
This is a symmetric matrix,Its eigenvalues are σi ≥ 0So RH is positive!
63
Summary of rigid alignment:
Translate the input points to the centroids:
Compute the “covariance matrix”
Compute the SVD of H:
The optimal rotation is
The translation vector is
i ii i′′ = =− −qp p qp q
1
nT
i ii
H=
′ ′= ∑ q p
TH U V= Σ
TR VU=
R= −t p q
64
Complexity
Numerical SVD is an expensive operationWe always need to pay attention to the dimensions of the matrix we’re applying SVD to.
65
Small somewhat related example
The End