Lecture: Face Recognition and Feature Reduction
cs131.stanford.edu/files/12_svd_PCA.pdf · Stanford University


Lecture: Face Recognition and Feature Reduction

Juan Carlos Niebles and Ranjay Krishna, Stanford Vision and Learning Lab

2-Nov-17


Recap - Curse of dimensionality

• Assume 5000 points uniformly distributed in the unit hypercube and we want to apply 5-NN. Suppose our query point is at the origin.
– In 1 dimension, we must go a distance of 5/5000 = 0.001 on average to capture the 5 nearest neighbors.
– In 2 dimensions, we must go out to a square that contains 0.001 of the volume.
– In d dimensions, we must go out to (0.001)^(1/d).
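To see how quickly that edge length grows with d, a short computation of (0.001)^(1/d) (a minimal sketch, not from the slides):

```python
# Edge length of a sub-cube containing a fraction 0.001 of the unit
# hypercube's volume, for increasing dimension d.
for d in (1, 2, 3, 10, 100):
    print(d, 0.001 ** (1 / d))   # 0.001, 0.032, 0.1, 0.50, 0.93 ...
```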


What we will learn today
• Singular value decomposition
• Principal Component Analysis (PCA)
• Image compression


Singular Value Decomposition (SVD)

• There are several computer algorithms that can "factorize" a matrix, representing it as the product of some other matrices
• The most useful of these is the Singular Value Decomposition
• Represents any matrix A as a product of three matrices: UΣVᵀ
• Python command:
– U, S, Vt = numpy.linalg.svd(A)
(numpy returns S as a 1-D array of singular values, and the third output is already Vᵀ)
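As a quick illustration of that call (a minimal sketch; the matrix A here is just random data):

```python
import numpy as np

A = np.random.rand(5, 3)                 # any real matrix

U, S, Vt = np.linalg.svd(A)              # S: singular values, Vt: V transpose

# Rebuild A: put S on the diagonal of an m x n matrix before multiplying.
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, S)
print(np.allclose(A, U @ Sigma @ Vt))    # True
```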


Singular Value Decomposition (SVD)

UΣVᵀ = A
• where U and V are rotation matrices, and Σ is a scaling matrix. For example:


Singular Value Decomposition (SVD)
• Beyond 2x2 matrices:
– In general, if A is m x n, then U will be m x m, Σ will be m x n, and Vᵀ will be n x n.
– (Note the dimensions work out to produce m x n after multiplication)


Singular Value Decomposition (SVD)

• U and V are always rotation matrices.
– Geometric rotation may not be an applicable concept, depending on the matrix, so we call them "unitary" matrices – each column is a unit vector.
• Σ is a diagonal matrix
– The number of nonzero entries = rank of A
– The algorithm always sorts the entries high to low


SVD Applications

• We've discussed SVD in terms of geometric transformation matrices
• But SVD of an image matrix can also be very useful
• To understand this, we'll look at a less geometric interpretation of what SVD is doing


SVD Applications

• Look at how the multiplication works out, left to right:
• Column 1 of U gets scaled by the first value from Σ.
• The resulting vector gets scaled by row 1 of Vᵀ to produce a contribution to the columns of A


SVD Applications

• Each product of (column i of U) · (value i from Σ) · (row i of Vᵀ) produces a component of the final A.


SVD Applications

• We're building A as a linear combination of the columns of U
• Using all columns of U, we'll rebuild the original matrix perfectly
• But, in real-world data, often we can just use the first few columns of U and we'll get something close (e.g. the first A_partial, above)
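A minimal sketch of this sum-of-rank-1-pieces view (the variable names and the random matrix are illustrative, not from the slides):

```python
import numpy as np

A = np.random.rand(6, 4)
U, S, Vt = np.linalg.svd(A)

# Build A one component at a time: (column i of U) * (value i of S) * (row i of Vt).
k = 2                                    # keep only the first k components
A_partial = sum(S[i] * np.outer(U[:, i], Vt[i, :]) for i in range(k))

# With k = min(A.shape) the sum reproduces A exactly; smaller k gives an approximation.
print(np.linalg.norm(A - A_partial))
```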


SVD Applications

• We can call those first few columns of U the Principal Components of the data
• They show the major patterns that can be added to produce the columns of the original matrix
• The rows of Vᵀ show how the principal components are mixed to produce the columns of the matrix


SVD Applications

We can look at Σ to see that the first column has a large effect, while the second column has a much smaller effect in this example.


SVD Applications

• For this image, using only the first 10 of 300 principal components produces a recognizable reconstruction
• So, SVD can be used for image compression
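A minimal sketch of SVD-based compression for a grayscale image array (the helper name and the random placeholder image are mine; with a real photo, keeping roughly 10 of a few hundred components already gives a recognizable result, as the slide notes):

```python
import numpy as np

def svd_compress(img, k):
    """Keep only the top-k singular values/vectors of a 2-D image array."""
    U, S, Vt = np.linalg.svd(img, full_matrices=False)
    return U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

img = np.random.rand(300, 400)           # stand-in for a real grayscale image
approx = svd_compress(img, 10)
print(img.shape, approx.shape)           # same shape, rank-10 approximation
```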


SVD for symmetric matrices

• If A is a symmetric matrix, it can be decomposed as A = UΣUᵀ
• Compared to a traditional SVD decomposition, U = V (the same matrix appears on both sides), and it is an orthogonal matrix.
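A small numerical check of this claim, using a symmetric positive semidefinite matrix built from random data (for that case the SVD and the eigendecomposition coincide exactly):

```python
import numpy as np

B = np.random.rand(4, 4)
A = B @ B.T                              # symmetric (and positive semidefinite)

U, S, Vt = np.linalg.svd(A)
print(np.allclose(U, Vt.T))              # True: the same orthogonal matrix on both sides
print(np.allclose(A, U @ np.diag(S) @ U.T))
print(np.allclose(np.sort(S), np.sort(np.linalg.eigvalsh(A))))  # singular values = eigenvalues
```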


Principal Component Analysis

• Remember, columns of U are the Principal Components of the data: the major patterns that can be added to produce the columns of the original matrix
• One use of this is to construct a matrix where each column is a separate data sample
• Run SVD on that matrix, and look at the first few columns of U to see patterns that are common among the columns
• This is called Principal Component Analysis (or PCA) of the data samples


Principal Component Analysis

• Often, raw data samples have a lot of redundancy and patterns
• PCA can allow you to represent data samples as weights on the principal components, rather than using the original raw form of the data
• By representing each sample as just those weights, you can represent just the "meat" of what's different between samples
• This minimal representation makes machine learning and other algorithms much more efficient


How is SVD computed?

• For this class: tell Python to do it. Use the result.
• But, if you're interested, one computer algorithm to do it makes use of eigenvectors!


Eigenvector definition

• Suppose we have a square matrix A. We can solve for a vector x and scalar λ such that Ax = λx
• In other words, find vectors where, if we transform them with A, the only effect is to scale them with no change in direction.
• These vectors are called eigenvectors (German for "self vector" of the matrix), and the scaling factors λ are called eigenvalues
• An m x m matrix will have ≤ m eigenvectors where λ is nonzero


Finding eigenvectors
• Computers can find an x such that Ax = λx using this iterative algorithm:
– x = random unit vector
– while (x hasn't converged):
• x = Ax
• normalize x
• x will quickly converge to an eigenvector
• Some simple modifications will let this algorithm find all eigenvectors
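A runnable version of that iteration (the convergence test and the Rayleigh-quotient eigenvalue estimate are my additions; the loop converges to the eigenvector with the largest-magnitude eigenvalue):

```python
import numpy as np

def power_iteration(A, num_iters=1000, tol=1e-10):
    x = np.random.rand(A.shape[1])
    x /= np.linalg.norm(x)               # random unit vector
    for _ in range(num_iters):
        x_new = A @ x                     # x = Ax
        x_new /= np.linalg.norm(x_new)    # normalize x
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x @ A @ x, x                   # (eigenvalue estimate, eigenvector)

lam, v = power_iteration(np.array([[2.0, 1.0], [1.0, 3.0]]))
print(lam, v)
```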


Finding SVD

• Eigenvectors are for square matrices, but SVD is for all matrices
• To do svd(A), computers can do this:
– Take eigenvectors of AAᵀ (that matrix is always square).
• These eigenvectors are the columns of U.
• Square roots of the eigenvalues are the singular values (the entries of Σ).
– Take eigenvectors of AᵀA (that matrix is always square).
• These eigenvectors are columns of V (or rows of Vᵀ)
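A quick numerical check of this recipe (just a sanity check with numpy, not how production SVD routines actually work):

```python
import numpy as np

A = np.random.rand(5, 3)

# Eigenvectors of A^T A give V; square roots of its eigenvalues are the singular values.
eigvals, V_from_eig = np.linalg.eigh(A.T @ A)                 # ascending order
singular_values = np.sqrt(np.clip(eigvals, 0, None))[::-1]    # sort high to low

U_ref, S_ref, Vt_ref = np.linalg.svd(A)
print(np.allclose(singular_values, S_ref))                    # True
# Columns of V_from_eig match rows of Vt_ref up to ordering and sign.
```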


Finding SVD

• Moral of the story: SVD is fast, even for large matrices
• It's useful for a lot of stuff
• There are also other algorithms to compute the SVD, or only part of the SVD
– Python's np.linalg.svd() command has options to efficiently compute only what you need, if performance becomes an issue


A detailed geometric explanation of SVD is here: http://www.ams.org/samplings/feature-column/fcarc-svd


What we will learn today
• Introduction to face recognition
• Principal Component Analysis (PCA)
• Image compression


Covariance
• Variance and covariance are measures of the "spread" of a set of points around their center of mass (mean)
• Variance – a measure of the deviation from the mean for points in one dimension, e.g. heights
• Covariance – a measure of how much each of the dimensions varies from the mean with respect to the others
• Covariance is measured between 2 dimensions to see if there is a relationship between the 2 dimensions, e.g. number of hours studied & marks obtained
• The covariance between one dimension and itself is the variance


Covariance

• So, if you had a 3-dimensional data set (x, y, z), then you could measure the covariance between the x and y dimensions, the y and z dimensions, and the x and z dimensions. Measuring the covariance between x and x, or y and y, or z and z would give you the variance of the x, y and z dimensions respectively.


Covariance matrix
• Representing covariance between dimensions as a matrix, e.g. for 3 dimensions:

  C = | cov(x,x)  cov(x,y)  cov(x,z) |
      | cov(y,x)  cov(y,y)  cov(y,z) |
      | cov(z,x)  cov(z,y)  cov(z,z) |

• The diagonal holds the variances of x, y and z
• cov(x,y) = cov(y,x), hence the matrix is symmetric about the diagonal
• N-dimensional data will result in an N x N covariance matrix
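A small numpy sketch of such a 3x3 covariance matrix (the synthetic x, y, z samples are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2 * x + rng.normal(size=100)       # correlated with x
z = rng.normal(size=100)               # unrelated to both

C = np.cov(np.vstack([x, y, z]))       # 3x3 covariance matrix (rows = dimensions)
print(C)                                # diagonal: variances; off-diagonal: covariances
print(np.allclose(C, C.T))              # symmetric about the diagonal
```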


Covariance

• What is the interpretation of covariance calculations?
– e.g.: 2-dimensional data set
– x: number of hours studied for a subject
– y: marks obtained in that subject
– the covariance value is, say, 104.53
– what does this value mean?


Covariance interpretation


Covariance interpretation
• The exact value is not as important as its sign.
• A positive value of covariance indicates that both dimensions increase or decrease together, e.g. as the number of hours studied increases, the marks in that subject increase.
• A negative value indicates that while one increases the other decreases, or vice-versa, e.g. active social life at PSU vs performance in the CS dept.
• If the covariance is zero, the two dimensions are uncorrelated, e.g. heights of students vs the marks obtained in a subject


Example data

Covariance between the two axes is high. Can we reduce the number of dimensions to just 1?


Geometric interpretation of PCA


Geometric interpretation of PCA

• Let's say we have a set of 2D data points x. But we see that all the points lie on a line in 2D.
• So, 2 dimensions are redundant to express the data. We can express all the points with just one dimension.

[Figure: 1D subspace in 2D]

PCA: Principal Component Analysis

• Given a set of points, how do we know if they can be compressed like in the previous example?
– The answer is to look into the correlation between the points
– The tool for doing this is called PCA


PCA Formulation
• Basic idea:
– If the data lives in a subspace, it is going to look very flat when viewed from the full space, e.g.:

[Figures: 1D subspace in 2D; 2D subspace in 3D]

Slide inspired by N. Vasconcelos

PCA Formulation
• Assume x is Gaussian with covariance Σ.
• Recall that a Gaussian is defined by its mean and covariance:
  p(x) = (1 / ((2π)^(d/2) |Σ|^(1/2))) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))
• Recall that μ and Σ of a Gaussian are defined as:
  μ = E[x],   Σ = E[(x − μ)(x − μ)ᵀ]

[Figure: 2D Gaussian contour in the (x1, x2) plane, with principal directions φ1, φ2 and lengths λ1, λ2]


PCA formulation

• Since Gaussians are symmetric, their covariance matrix is also a symmetric matrix, so we can express it as:
– Σ = UΛUᵀ = UΛ^(1/2)(UΛ^(1/2))ᵀ


PCA Formulation
• If x is Gaussian with covariance Σ,
– the principal components φi are the eigenvectors of Σ
– the principal lengths λi are the eigenvalues of Σ
• By computing the eigenvalues we know whether the data is
– not flat if λ1 ≈ λ2
– flat if λ1 >> λ2
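A small sketch of this flatness test on synthetic 2D data (the data generation and the ratio print-out are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) * np.array([5.0, 0.2])   # wide in x1, thin in x2

Sigma = np.cov(X, rowvar=False)            # 2x2 sample covariance
eigvals, eigvecs = np.linalg.eigh(Sigma)   # principal lengths and directions
lam2, lam1 = eigvals                       # eigh returns ascending order
print(lam1, lam2, lam1 / lam2)             # lam1 >> lam2, so this data is "flat"
```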

Slide inspired by N. Vasconcelos



PCA Algorithm (training)

Slide inspired by N. Vasconcelos


PCA Algorithm (testing)

Slide inspired by N. Vasconcelos
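The training and testing steps on these two slides appear as equations and figures in the original deck. A minimal sketch of the standard procedure (center the data, eigendecompose the covariance, keep the top-k eigenvectors, then project new samples); the function names are mine:

```python
import numpy as np

def pca_train(X, k):
    """X: (d, n) data matrix, one example per column. Returns mean, top-k components, eigenvalues."""
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu                                   # center the data
    Sigma = (Xc @ Xc.T) / X.shape[1]              # sample covariance (d x d)
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    order = np.argsort(eigvals)[::-1]             # eigenvalues high to low
    return mu, eigvecs[:, order[:k]], eigvals[order[:k]]

def pca_test(x, mu, components):
    """Project a sample onto the principal components: its k weights."""
    return components.T @ (x - mu.ravel())

X = np.random.rand(10, 200)                       # 200 samples in 10-D
mu, comps, lams = pca_train(X, k=3)
print(pca_test(X[:, 0], mu, comps))               # 3 weights for the first sample
```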


PCA by SVD
• An alternative manner to compute the principal components, based on singular value decomposition
• Quick reminder: SVD
– Any real n x m matrix (n > m) can be decomposed as A = MΠNᵀ
– where M is an (n x m) column-orthonormal matrix of left singular vectors (columns of M)
– Π is an (m x m) diagonal matrix of singular values
– Nᵀ is an (m x m) row-orthonormal matrix of right singular vectors (columns of N)

Slide inspired by N. Vasconcelos


PCA by SVD
• To relate this to PCA, we consider the data matrix X = [x1 x2 ... xn], with one example per column
• The sample mean is μ = (1/n) Σᵢ xᵢ

Slide inspired by N. Vasconcelos


PCA by SVD
• Center the data by subtracting the mean from each column of X
• The centered data matrix is Xc = [x1 − μ, x2 − μ, ..., xn − μ]

Slide inspired by N. Vasconcelos


PCA by SVD
• The sample covariance matrix is Σ = (1/n) Σᵢ xᵢᶜ (xᵢᶜ)ᵀ, where xᵢᶜ is the ith column of Xc
• This can be written as Σ = (1/n) Xc Xcᵀ

Slide inspired by N. Vasconcelos


PCA by SVD
• The matrix (1/√n) Xcᵀ is real (n x d). Assuming n > d, it has SVD decomposition
  (1/√n) Xcᵀ = MΠNᵀ,  with MᵀM = I and NᵀN = I
and
  Σ = (1/n) Xc Xcᵀ = NΠMᵀMΠNᵀ = NΠ²Nᵀ

Slide inspired by N. Vasconcelos


PCA by SVD

• Note that N is (d x d) and orthonormal, and Π² is diagonal. This is just the eigenvalue decomposition of Σ
• It follows that
– the eigenvectors of Σ are the columns of N
– the eigenvalues of Σ are the squared singular values, λᵢ = πᵢ²
• This gives an alternative algorithm for PCA

Slide inspired by N. Vasconcelos


PCA by SVD
• In summary, computation of PCA by SVD
• Given X with one example per column
– Create the centered data matrix Xc
– Compute its SVD: (1/√n) Xcᵀ = MΠNᵀ
– The principal components are the columns of N, and the eigenvalues are λᵢ = πᵢ²
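A minimal sketch of this summary (the function name and the cross-check against np.cov are mine):

```python
import numpy as np

def pca_by_svd(X, k):
    """X: (d, n), one example per column. Returns top-k principal components and eigenvalues."""
    n = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)                # centered data matrix
    M, Pi, Nt = np.linalg.svd(Xc.T / np.sqrt(n), full_matrices=False)
    return Nt[:k].T, Pi[:k] ** 2                          # components (d, k); eigenvalues pi_i^2

X = np.random.rand(5, 100)
components, eigvals = pca_by_svd(X, k=2)
# Cross-check: same values as the top eigenvalues of the sample covariance.
print(np.allclose(np.sort(eigvals),
                  np.sort(np.linalg.eigvalsh(np.cov(X, bias=True)))[-2:]))
```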

Slide inspired by N. Vasconcelos


Rule of thumb for finding the number of PCA components

• A natural measure is to pick the eigenvectors that explain p% of the data variability
– This can be done by plotting the ratio r_k = (λ1 + … + λk) / (λ1 + … + λn) as a function of k
– E.g. we need 3 eigenvectors to cover 70% of the variability of this dataset
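A small sketch of this rule (the eigenvalues below are hypothetical, chosen so that three components cover 70% of the variance):

```python
import numpy as np

def num_components_for(eigvals, p):
    """Smallest k whose top-k eigenvalues explain at least a fraction p of the total variance."""
    lam = np.sort(eigvals)[::-1]
    r = np.cumsum(lam) / lam.sum()          # r_k for k = 1..n
    return int(np.searchsorted(r, p) + 1)

eigvals = np.array([4.0, 2.0, 1.5, 0.8, 0.5, 0.2])
print(num_components_for(eigvals, 0.70))    # 3
```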

Slide inspired by N. Vasconcelos


What we will learn today
• Introduction to face recognition
• Principal Component Analysis (PCA)
• Image compression


Original Image

• Divide the original 372x492 image into patches:
• Each patch is an instance that contains 12x12 pixels on a grid
• View each patch as a 144-D vector
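A minimal sketch of this patch extraction (the random array stands in for the lecture's image):

```python
import numpy as np

def image_to_patches(img, patch=12):
    """Cut a grayscale image into non-overlapping patch x patch blocks, one row per patch."""
    h, w = img.shape
    h, w = h - h % patch, w - w % patch           # crop so the grid fits exactly
    blocks = img[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

img = np.random.rand(372, 492)                    # stand-in for the original image
patches = image_to_patches(img)
print(patches.shape)                              # (31 * 41, 144) = (1271, 144)
```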


L2 error and PCA dimension


PCA compression: 144D → 60D


PCA compression: 144D → 16D


16 most important eigenvectors

[Figure: the 16 eigenvectors, each displayed as a 12x12 patch]


PCA compression: 144D → 6D


6 most important eigenvectors

[Figure: the 6 eigenvectors, each displayed as a 12x12 patch]


PCA compression: 144D → 3D


3 most important eigenvectors

[Figure: the 3 eigenvectors, each displayed as a 12x12 patch]


PCA compression: 144D → 1D


What we have learned today
• Introduction to face recognition
• Principal Component Analysis (PCA)
• Image compression
