
CS 2750: Machine Learning

Dimensionality Reduction

Prof. Adriana Kovashka

University of Pittsburgh

January 19, 2017

Plan for today

• Dimensionality reduction – motivation

• Principal Component Analysis (PCA)

• Applications of PCA

• Other methods for dimensionality reduction

Why reduce dimensionality?

• Data may intrinsically live in a lower-dim space

• Too many features and too few data

• Lower computational expense (memory, train/test time)

• Want to visualize the data in a lower-dim space

• Want to use data of different dimensionality

Goal

• Input: Data in a high-dim feature space

• Output: Projection of same data into a lower-dim space

• F: high-dim X → low-dim X

Goal

Slide credit: Erik Sudderth

Some criteria for success

• Find a projection where the data has:

– Low reconstruction error

– High variance of the data

See hand-written notes for how we find the optimal projection

Slide credit: Subhransu Maji
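A hedged summary of how these two criteria connect (the notation is mine, not from the hand-written notes): for mean-centered data points x_n and a unit-length direction u, maximizing the variance of the projections is the same as minimizing the squared reconstruction error, because of the Pythagorean identity below.

$$\max_{\|u\|=1}\ \frac{1}{N}\sum_{n=1}^{N}\big(u^\top x_n\big)^2 \;=\; \max_{\|u\|=1}\ u^\top S\,u, \qquad S=\frac{1}{N}\sum_{n=1}^{N} x_n x_n^\top, \qquad \|x_n\|^2 = \big(u^\top x_n\big)^2 + \big\|x_n-(u^\top x_n)\,u\big\|^2.$$

The optimal u is the eigenvector of the covariance S with the largest eigenvalue.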

Principal Components Analysis

Demo

• http://www.cs.pitt.edu/~kovashka/cs2750_sp17/PCA_demo.m

• http://www.cs.pitt.edu/~kovashka/cs2750_sp17/PCA.m

• Demo with eigenfaces: http://www.cs.ait.ac.th/~mdailey/matlab/

Implementation issue

• Covariance matrix is huge (D² for D pixels)

• But typically # examples N << D

• Simple trick:

– X is the N×D matrix of normalized (mean-centered) training data

– Solve for eigenvectors u of XXᵀ instead of XᵀX

– Then Xᵀu is an eigenvector of the covariance XᵀX

– Need to normalize each vector Xᵀu to unit length

Adapted from Derek Hoiem
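A minimal MATLAB sketch of this trick (the variable names are illustrative, not from the demo files; X is the N×D matrix of mean-centered training data):

  [V, L] = eig(X * X');                % eigen-decomposition of the small N x N matrix X*X'
  [lambda, order] = sort(diag(L), 'descend');
  V = V(:, order);                     % eigenvectors sorted by decreasing eigenvalue
  U = X' * V;                          % each column X'*v is an eigenvector of X'*X
  U = U ./ sqrt(sum(U.^2, 1));         % normalize each column to unit length

The sorted eigenvalues lambda can then feed the K-selection rule on the next slide, since cumulative variance fractions are unaffected by the covariance normalization constant.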


How to pick K?

• One goal can be to pick K such that a fraction P of the data's variance is preserved, e.g. P = 0.9 (90%)

• Let Λ = a vector containing the eigenvalues of the covariance matrix

• Total variance can be obtained from entries of Λ

– total_variance = sum(Λ);

• Take as many of these entries as needed

– K = find( cumsum(Λ) / total_variance >= P, 1);
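Put together as a hedged MATLAB sketch (Λ written as lambda; P is a fraction, as above):

  lambda = sort(eig(cov(X)), 'descend');              % eigenvalues of the covariance matrix
  total_variance = sum(lambda);
  P = 0.9;                                            % preserve 90% of the variance
  K = find(cumsum(lambda) / total_variance >= P, 1);  % smallest K reaching that fraction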

Variance preserved at i-th eigenvalue

Figure 12.4 (a) from Bishop

Application: Face Recognition

Image from cnet.com

Face recognition: once you’ve detected and cropped a face, try to recognize it

Detection → Recognition → “Sally”

Slide credit: Lana Lazebnik

Typical face recognition scenarios

• Verification: a person is claiming a particular identity; verify whether that is true

– E.g., security

• Closed-world identification: assign a face to one person from among a known set

• General identification: assign a face to a known person or to “unknown”

Slide credit: Derek Hoiem

The space of all face images

• When viewed as vectors of pixel values, face images are extremely high-dimensional

– 24x24 image = 576 dimensions

– Slow and lots of storage

• But very few 576-dimensional vectors are valid face images

• We want to effectively model the subspace of face images

Adapted from Derek Hoiem

Representation and reconstruction

• Face x in “face space” coordinates: (w1, w2, …) = (u1ᵀ(x − µ), u2ᵀ(x − µ), …)

• Reconstruction: x̂ = µ + w1u1 + w2u2 + w3u3 + w4u4 + …

Slide credit: Derek Hoiem
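A minimal MATLAB sketch of these two operations, assuming mu is the D×1 mean face and U is a D×k matrix whose columns are the principal components u1, …, uk (the names are illustrative):

  w = U' * (x - mu);       % face-space coordinates: w(i) = u_i' * (x - mu)
  x_hat = mu + U * w;      % reconstruction: x_hat = mu + sum_i w(i) * u_i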

Recognition w/ eigenfaces

Process labeled training images

• Find mean µ and covariance matrix Σ

• Find k principal components (eigenvectors of Σ): u1, …, uk

• Project each training image xi onto the subspace spanned by the principal components: (wi1, …, wik) = (u1ᵀxi, …, ukᵀxi)

Given novel image x

• Project onto subspace: (w1, …, wk) = (u1ᵀx, …, ukᵀx)

• Classify as closest training face in k-dimensional subspace

M. Turk and A. Pentland, Face Recognition using Eigenfaces, CVPR 1991

Adapted from Derek Hoiem
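A hedged MATLAB sketch of the recognition step, assuming U (D×k eigenfaces), mu (D×1 mean face), W_train (N×k projected training faces) and train_labels are already available; all names are illustrative, not from the demo files:

  w = (U' * (x - mu))';                  % 1 x k face-space coordinates of the novel face x
  d = sum((W_train - w).^2, 2);          % squared distances to each projected training face
  [~, idx] = min(d);                     % nearest training face
  predicted_label = train_labels(idx);   % classify as that person's identity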

Slide credit: Alexander Ihler


Plan for today

• Dimensionality reduction – motivation

• Principal Component Analysis (PCA)

• Applications of PCA

• Other methods for dimensionality reduction

PCA

• General dimensionality reduction technique

• Preserves most of the variance with a much more compact representation

– Lower storage requirements (eigenvectors + a few numbers per face)

– Faster matching

• What are some problems?

Slide credit: Derek Hoiem

PCA limitations

• The direction of maximum variance is not always good for classification

Slide credit: Derek Hoiem

PCA limitations

• PCA preserves maximum variance

• A more discriminative subspace:

Fisher Linear Discriminants

• FLD preserves discrimination

– Find projection that maximizes scatter between classes and minimizes scatter within classes

Adapted from Derek Hoiem
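A hedged sketch of the criterion behind this (the notation is mine, not from the slides): with between-class scatter matrix $S_B$ and within-class scatter matrix $S_W$, FLD chooses the projection direction $w$ maximizing

$$J(w) \;=\; \frac{w^\top S_B\, w}{w^\top S_W\, w},$$

so that the projected class means end up far apart relative to the spread within each class; for two classes the solution is $w \propto S_W^{-1}(\mu_1 - \mu_2)$.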

[Figure: using two classes as an example, a poor projection direction vs. a good (discriminative) one, shown in the (x1, x2) plane]

Slide credit: Derek Hoiem

Fisher’s Linear Discriminant

Slide credit: Derek Hoiem

Comparison with PCA

Other dimensionality reduction methods

• Non-linear:

– Kernel PCA (Schölkopf et al., Neural Computation 1998)

– Independent component analysis – Comon, Signal Processing 1994

– LLE (locally linear embedding) – Roweis and Saul, Science 2000

– ISOMAP (isometric feature mapping) – Tenenbaum et al., Science 2000

– t-SNE (t-distributed stochastic neighbor embedding) – van der Maaten and Hinton, JMLR 2008
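As a usage sketch only, assuming MATLAB's Statistics and Machine Learning Toolbox (which provides tsne) and a numeric vector of class labels for coloring (illustrative, not part of the slides):

  Y = tsne(X);                                     % embed the N x D data into N x 2
  scatter(Y(:,1), Y(:,2), 10, labels, 'filled');   % color each point by its class label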

ISOMAP example

Figure from Carlotta Domeniconi

ISOMAP example

Figure from Carlotta Domeniconi

t-SNE example

Figure from Genevieve Patterson, IJCV 2014

t-SNE example

Thomas and Kovashka, CVPR 2016

t-SNE example

Thomas and Kovashka, CVPR 2016
