Lecture: Face Recognition and Feature Reduction
cs131.stanford.edu/files/12_svd_PCA.pdf · Stanford University


Lecture: Face Recognition and Feature Reduction

Juan Carlos Niebles and Ranjay Krishna, Stanford Vision and Learning Lab

2-Nov-17


Recap - Curse of dimensionality

• Assume 5000 points uniformly distributed in the unit hypercube and we want to apply 5-NN. Suppose our query point is at the origin.
– In 1 dimension, we must go a distance of 5/5000 = 0.001 on average to capture the 5 nearest neighbors.
– In 2 dimensions, we must go out to a square that contains 0.001 of the volume.
– In d dimensions, we must go out to (0.001)^(1/d).
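To see how quickly that edge length grows with d, a short computation of (0.001)^(1/d) (a minimal sketch, not from the slides):

```python
# Edge length of a sub-cube containing a fraction 0.001 of the unit
# hypercube's volume, for increasing dimension d.
for d in (1, 2, 3, 10, 100):
    print(d, 0.001 ** (1 / d))   # 0.001, 0.032, 0.1, 0.50, 0.93 ...
```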


What we will learn today
• Singular value decomposition
• Principal Component Analysis (PCA)
• Image compression


Singular Value Decomposition (SVD)

• There are several computer algorithms that can "factorize" a matrix, representing it as the product of some other matrices
• The most useful of these is the Singular Value Decomposition
• Represents any matrix A as a product of three matrices: UΣVᵀ
• Python command:
– U, S, Vt = numpy.linalg.svd(A)
(numpy returns S as a 1-D array of singular values, and the third output is already Vᵀ)
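As a quick illustration of that call (a minimal sketch; the matrix A here is just random data):

```python
import numpy as np

A = np.random.rand(5, 3)                 # any real matrix

U, S, Vt = np.linalg.svd(A)              # S: singular values, Vt: V transpose

# Rebuild A: put S on the diagonal of an m x n matrix before multiplying.
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, S)
print(np.allclose(A, U @ Sigma @ Vt))    # True
```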


Singular Value Decomposition (SVD)

UΣVᵀ = A
• where U and V are rotation matrices, and Σ is a scaling matrix. For example:


Singular Value Decomposition (SVD)
• Beyond 2x2 matrices:
– In general, if A is m x n, then U will be m x m, Σ will be m x n, and Vᵀ will be n x n.
– (Note the dimensions work out to produce m x n after multiplication)


Singular Value Decomposition (SVD)

• U and V are always rotation matrices.
– Geometric rotation may not be an applicable concept, depending on the matrix, so we call them "unitary" matrices – each column is a unit vector.
• Σ is a diagonal matrix
– The number of nonzero entries = rank of A
– The algorithm always sorts the entries high to low


SVD Applications

• We've discussed SVD in terms of geometric transformation matrices
• But SVD of an image matrix can also be very useful
• To understand this, we'll look at a less geometric interpretation of what SVD is doing


SVD Applications

• Look at how the multiplication works out, left to right:
• Column 1 of U gets scaled by the first value from Σ.
• The resulting vector gets scaled by row 1 of Vᵀ to produce a contribution to the columns of A


SVD Applications

• Each product of (column i of U) · (value i from Σ) · (row i of Vᵀ) produces a component of the final A.


SVD Applications

• We're building A as a linear combination of the columns of U
• Using all columns of U, we'll rebuild the original matrix perfectly
• But, in real-world data, often we can just use the first few columns of U and we'll get something close (e.g. the first A_partial, above)
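A minimal sketch of this sum-of-rank-1-pieces view (the variable names and the random matrix are illustrative, not from the slides):

```python
import numpy as np

A = np.random.rand(6, 4)
U, S, Vt = np.linalg.svd(A)

# Build A one component at a time: (column i of U) * (value i of S) * (row i of Vt).
k = 2                                    # keep only the first k components
A_partial = sum(S[i] * np.outer(U[:, i], Vt[i, :]) for i in range(k))

# With k = min(A.shape) the sum reproduces A exactly; smaller k gives an approximation.
print(np.linalg.norm(A - A_partial))
```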


SVD Applications

• We can call those first few columns of U the Principal Components of the data
• They show the major patterns that can be added to produce the columns of the original matrix
• The rows of Vᵀ show how the principal components are mixed to produce the columns of the matrix


SVD Applications

We can look at Σ to see that the first column has a large effect, while the second column has a much smaller effect in this example.


SVD Applications

• For this image, using only the first 10 of 300 principal components produces a recognizable reconstruction
• So, SVD can be used for image compression
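A minimal sketch of SVD-based compression for a grayscale image array (the helper name and the random placeholder image are mine; with a real photo, keeping roughly 10 of a few hundred components already gives a recognizable result, as the slide notes):

```python
import numpy as np

def svd_compress(img, k):
    """Keep only the top-k singular values/vectors of a 2-D image array."""
    U, S, Vt = np.linalg.svd(img, full_matrices=False)
    return U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

img = np.random.rand(300, 400)           # stand-in for a real grayscale image
approx = svd_compress(img, 10)
print(img.shape, approx.shape)           # same shape, rank-10 approximation
```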


SVD for symmetric matrices

• If A is a symmetric matrix, it can be decomposed as A = UΣUᵀ
• Compared to a traditional SVD decomposition, U = V (the same matrix appears on both sides), and it is an orthogonal matrix.
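A small numerical check of this claim, using a symmetric positive semidefinite matrix built from random data (for that case the SVD and the eigendecomposition coincide exactly):

```python
import numpy as np

B = np.random.rand(4, 4)
A = B @ B.T                              # symmetric (and positive semidefinite)

U, S, Vt = np.linalg.svd(A)
print(np.allclose(U, Vt.T))              # True: the same orthogonal matrix on both sides
print(np.allclose(A, U @ np.diag(S) @ U.T))
print(np.allclose(np.sort(S), np.sort(np.linalg.eigvalsh(A))))  # singular values = eigenvalues
```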


Principal Component Analysis

• Remember, columns of U are the Principal Components of the data: the major patterns that can be added to produce the columns of the original matrix
• One use of this is to construct a matrix where each column is a separate data sample
• Run SVD on that matrix, and look at the first few columns of U to see patterns that are common among the columns
• This is called Principal Component Analysis (or PCA) of the data samples


Principal Component Analysis

• Often, raw data samples have a lot of redundancy and patterns
• PCA can allow you to represent data samples as weights on the principal components, rather than using the original raw form of the data
• By representing each sample as just those weights, you can represent just the "meat" of what's different between samples
• This minimal representation makes machine learning and other algorithms much more efficient


How is SVD computed?

• For this class: tell Python to do it. Use the result.
• But, if you're interested, one computer algorithm to do it makes use of eigenvectors!


Eigenvector definition

• Suppose we have a square matrix A. We can solve for a vector x and scalar λ such that Ax = λx
• In other words, find vectors where, if we transform them with A, the only effect is to scale them with no change in direction.
• These vectors are called eigenvectors (German for "self vector" of the matrix), and the scaling factors λ are called eigenvalues
• An m x m matrix will have ≤ m eigenvectors where λ is nonzero


Finding eigenvectors
• Computers can find an x such that Ax = λx using this iterative algorithm:
– x = random unit vector
– while (x hasn't converged):
• x = Ax
• normalize x
• x will quickly converge to an eigenvector
• Some simple modifications will let this algorithm find all eigenvectors
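A runnable version of that iteration (the convergence test and the Rayleigh-quotient eigenvalue estimate are my additions; the loop converges to the eigenvector with the largest-magnitude eigenvalue):

```python
import numpy as np

def power_iteration(A, num_iters=1000, tol=1e-10):
    x = np.random.rand(A.shape[1])
    x /= np.linalg.norm(x)               # random unit vector
    for _ in range(num_iters):
        x_new = A @ x                     # x = Ax
        x_new /= np.linalg.norm(x_new)    # normalize x
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x @ A @ x, x                   # (eigenvalue estimate, eigenvector)

lam, v = power_iteration(np.array([[2.0, 1.0], [1.0, 3.0]]))
print(lam, v)
```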


Finding SVD

• Eigenvectors are for square matrices, but SVD is for all matrices
• To do svd(A), computers can do this:
– Take eigenvectors of AAᵀ (that matrix is always square).
• These eigenvectors are the columns of U.
• Square roots of the eigenvalues are the singular values (the entries of Σ).
– Take eigenvectors of AᵀA (that matrix is always square).
• These eigenvectors are columns of V (or rows of Vᵀ)
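A quick numerical check of this recipe (just a sanity check with numpy, not how production SVD routines actually work):

```python
import numpy as np

A = np.random.rand(5, 3)

# Eigenvectors of A^T A give V; square roots of its eigenvalues are the singular values.
eigvals, V_from_eig = np.linalg.eigh(A.T @ A)                 # ascending order
singular_values = np.sqrt(np.clip(eigvals, 0, None))[::-1]    # sort high to low

U_ref, S_ref, Vt_ref = np.linalg.svd(A)
print(np.allclose(singular_values, S_ref))                    # True
# Columns of V_from_eig match rows of Vt_ref up to ordering and sign.
```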


Finding SVD

• Moral of the story: SVD is fast, even for large matrices
• It's useful for a lot of stuff
• There are also other algorithms to compute the SVD, or only part of the SVD
– Python's np.linalg.svd() command has options to efficiently compute only what you need, if performance becomes an issue


A detailed geometric explanation of SVD is here: http://www.ams.org/samplings/feature-column/fcarc-svd


What we will learn today
• Introduction to face recognition
• Principal Component Analysis (PCA)
• Image compression


Covariance
• Variance and covariance are measures of the "spread" of a set of points around their center of mass (mean)
• Variance – a measure of the deviation from the mean for points in one dimension, e.g. heights
• Covariance – a measure of how much each of the dimensions varies from the mean with respect to the others
• Covariance is measured between 2 dimensions to see if there is a relationship between the 2 dimensions, e.g. number of hours studied & marks obtained
• The covariance between one dimension and itself is the variance


Covariance

• So, if you had a 3-dimensional data set (x, y, z), then you could measure the covariance between the x and y dimensions, the y and z dimensions, and the x and z dimensions. Measuring the covariance between x and x, or y and y, or z and z would give you the variance of the x, y and z dimensions respectively.


Covariance matrix
• Representing covariance between dimensions as a matrix, e.g. for 3 dimensions:

  C = | cov(x,x)  cov(x,y)  cov(x,z) |
      | cov(y,x)  cov(y,y)  cov(y,z) |
      | cov(z,x)  cov(z,y)  cov(z,z) |

• The diagonal holds the variances of x, y and z
• cov(x,y) = cov(y,x), hence the matrix is symmetric about the diagonal
• N-dimensional data will result in an N x N covariance matrix
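A small numpy sketch of such a 3x3 covariance matrix (the synthetic x, y, z samples are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2 * x + rng.normal(size=100)       # correlated with x
z = rng.normal(size=100)               # unrelated to both

C = np.cov(np.vstack([x, y, z]))       # 3x3 covariance matrix (rows = dimensions)
print(C)                                # diagonal: variances; off-diagonal: covariances
print(np.allclose(C, C.T))              # symmetric about the diagonal
```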


Covariance

• What is the interpretation of covariance calculations?
– e.g.: 2-dimensional data set
– x: number of hours studied for a subject
– y: marks obtained in that subject
– the covariance value is, say, 104.53
– what does this value mean?


Covariance interpretation


Covariance interpretation
• The exact value is not as important as its sign.
• A positive value of covariance indicates that both dimensions increase or decrease together, e.g. as the number of hours studied increases, the marks in that subject increase.
• A negative value indicates that while one increases the other decreases, or vice-versa, e.g. active social life at PSU vs performance in the CS dept.
• If the covariance is zero, the two dimensions are uncorrelated, e.g. heights of students vs the marks obtained in a subject


Example data

Covariance between the two axes is high. Can we reduce the number of dimensions to just 1?


Geometric interpretation of PCA


Geometric interpretation of PCA

• Let's say we have a set of 2D data points x. But we see that all the points lie on a line in 2D.
• So, 2 dimensions are redundant to express the data. We can express all the points with just one dimension.

[Figure: 1D subspace in 2D]

PCA: Principal Component Analysis

• Given a set of points, how do we know if they can be compressed like in the previous example?
– The answer is to look into the correlation between the points
– The tool for doing this is called PCA


PCA Formulation
• Basic idea:
– If the data lives in a subspace, it is going to look very flat when viewed from the full space, e.g.:

[Figures: 1D subspace in 2D; 2D subspace in 3D]

Slide inspired by N. Vasconcelos

PCA Formulation
• Assume x is Gaussian with covariance Σ.
• Recall that a Gaussian is defined by its mean and covariance:
  p(x) = (1 / ((2π)^(d/2) |Σ|^(1/2))) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))
• Recall that μ and Σ of a Gaussian are defined as:
  μ = E[x],   Σ = E[(x − μ)(x − μ)ᵀ]

[Figure: 2D Gaussian contour in the (x1, x2) plane, with principal directions φ1, φ2 and lengths λ1, λ2]


PCA formulation

• Since Gaussians are symmetric, their covariance matrix is also a symmetric matrix, so we can express it as:
– Σ = UΛUᵀ = UΛ^(1/2)(UΛ^(1/2))ᵀ


PCA Formulation
• If x is Gaussian with covariance Σ,
– the principal components φi are the eigenvectors of Σ
– the principal lengths λi are the eigenvalues of Σ
• By computing the eigenvalues we know whether the data is
– not flat if λ1 ≈ λ2
– flat if λ1 >> λ2
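A small sketch of this flatness test on synthetic 2D data (the data generation and the ratio print-out are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) * np.array([5.0, 0.2])   # wide in x1, thin in x2

Sigma = np.cov(X, rowvar=False)            # 2x2 sample covariance
eigvals, eigvecs = np.linalg.eigh(Sigma)   # principal lengths and directions
lam2, lam1 = eigvals                       # eigh returns ascending order
print(lam1, lam2, lam1 / lam2)             # lam1 >> lam2, so this data is "flat"
```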

Slide inspired by N. Vasconcelos



PCA Algorithm (training)

Slide inspired by N. Vasconcelos


PCA Algorithm (testing)

Slide inspired by N. Vasconcelos
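The training and testing steps on these two slides appear as equations and figures in the original deck. A minimal sketch of the standard procedure (center the data, eigendecompose the covariance, keep the top-k eigenvectors, then project new samples); the function names are mine:

```python
import numpy as np

def pca_train(X, k):
    """X: (d, n) data matrix, one example per column. Returns mean, top-k components, eigenvalues."""
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu                                   # center the data
    Sigma = (Xc @ Xc.T) / X.shape[1]              # sample covariance (d x d)
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    order = np.argsort(eigvals)[::-1]             # eigenvalues high to low
    return mu, eigvecs[:, order[:k]], eigvals[order[:k]]

def pca_test(x, mu, components):
    """Project a sample onto the principal components: its k weights."""
    return components.T @ (x - mu.ravel())

X = np.random.rand(10, 200)                       # 200 samples in 10-D
mu, comps, lams = pca_train(X, k=3)
print(pca_test(X[:, 0], mu, comps))               # 3 weights for the first sample
```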


PCA by SVD
• An alternative manner to compute the principal components, based on singular value decomposition
• Quick reminder: SVD
– Any real n x m matrix (n > m) can be decomposed as A = MΠNᵀ
– where M is an (n x m) column-orthonormal matrix of left singular vectors (columns of M)
– Π is an (m x m) diagonal matrix of singular values
– Nᵀ is an (m x m) row-orthonormal matrix of right singular vectors (columns of N)

Slide inspired by N. Vasconcelos


PCA by SVD
• To relate this to PCA, we consider the data matrix X = [x1 x2 ... xn], with one example per column
• The sample mean is μ = (1/n) Σᵢ xᵢ

Slide inspired by N. Vasconcelos


PCA by SVD
• Center the data by subtracting the mean from each column of X
• The centered data matrix is Xc = [x1 − μ, x2 − μ, ..., xn − μ]

Slide inspired by N. Vasconcelos


PCA by SVD
• The sample covariance matrix is Σ = (1/n) Σᵢ xᵢᶜ (xᵢᶜ)ᵀ, where xᵢᶜ is the ith column of Xc
• This can be written as Σ = (1/n) Xc Xcᵀ

Slide inspired by N. Vasconcelos


PCA by SVD
• The matrix (1/√n) Xcᵀ is real (n x d). Assuming n > d, it has SVD decomposition
  (1/√n) Xcᵀ = MΠNᵀ,  with MᵀM = I and NᵀN = I
and
  Σ = (1/n) Xc Xcᵀ = NΠMᵀMΠNᵀ = NΠ²Nᵀ

Slide inspired by N. Vasconcelos


PCA by SVD

• Note that N is (d x d) and orthonormal, and Π² is diagonal. This is just the eigenvalue decomposition of Σ
• It follows that
– the eigenvectors of Σ are the columns of N
– the eigenvalues of Σ are the squared singular values, λᵢ = πᵢ²
• This gives an alternative algorithm for PCA

Slide inspired by N. Vasconcelos


PCA by SVD
• In summary, computation of PCA by SVD
• Given X with one example per column
– Create the centered data matrix Xc
– Compute its SVD: (1/√n) Xcᵀ = MΠNᵀ
– The principal components are the columns of N, and the eigenvalues are λᵢ = πᵢ²
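A minimal sketch of this summary (the function name and the cross-check against np.cov are mine):

```python
import numpy as np

def pca_by_svd(X, k):
    """X: (d, n), one example per column. Returns top-k principal components and eigenvalues."""
    n = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)                # centered data matrix
    M, Pi, Nt = np.linalg.svd(Xc.T / np.sqrt(n), full_matrices=False)
    return Nt[:k].T, Pi[:k] ** 2                          # components (d, k); eigenvalues pi_i^2

X = np.random.rand(5, 100)
components, eigvals = pca_by_svd(X, k=2)
# Cross-check: same values as the top eigenvalues of the sample covariance.
print(np.allclose(np.sort(eigvals),
                  np.sort(np.linalg.eigvalsh(np.cov(X, bias=True)))[-2:]))
```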

Slide inspired by N. Vasconcelos


Rule of thumb for finding the number of PCA components

• A natural measure is to pick the eigenvectors that explain p% of the data variability
– This can be done by plotting the ratio r_k = (λ1 + … + λk) / (λ1 + … + λn) as a function of k
– E.g. we need 3 eigenvectors to cover 70% of the variability of this dataset
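A small sketch of this rule (the eigenvalues below are hypothetical, chosen so that three components cover 70% of the variance):

```python
import numpy as np

def num_components_for(eigvals, p):
    """Smallest k whose top-k eigenvalues explain at least a fraction p of the total variance."""
    lam = np.sort(eigvals)[::-1]
    r = np.cumsum(lam) / lam.sum()          # r_k for k = 1..n
    return int(np.searchsorted(r, p) + 1)

eigvals = np.array([4.0, 2.0, 1.5, 0.8, 0.5, 0.2])
print(num_components_for(eigvals, 0.70))    # 3
```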

Slide inspired by N. Vasconcelos


What we will learn today
• Introduction to face recognition
• Principal Component Analysis (PCA)
• Image compression


Original Image

• Divide the original 372x492 image into patches:
• Each patch is an instance that contains 12x12 pixels on a grid
• View each patch as a 144-D vector
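A minimal sketch of this patch extraction (the random array stands in for the lecture's image):

```python
import numpy as np

def image_to_patches(img, patch=12):
    """Cut a grayscale image into non-overlapping patch x patch blocks, one row per patch."""
    h, w = img.shape
    h, w = h - h % patch, w - w % patch           # crop so the grid fits exactly
    blocks = img[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

img = np.random.rand(372, 492)                    # stand-in for the original image
patches = image_to_patches(img)
print(patches.shape)                              # (31 * 41, 144) = (1271, 144)
```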


L2 error and PCA dimension


PCA compression: 144D → 60D


PCA compression: 144D → 16D


16 most important eigenvectors

[Figure: the 16 eigenvectors, each displayed as a 12x12 patch]


PCA compression: 144D → 6D


6 most important eigenvectors

[Figure: the 6 eigenvectors, each displayed as a 12x12 patch]


PCA compression: 144D → 3D


3 most important eigenvectors

[Figure: the 3 eigenvectors, each displayed as a 12x12 patch]


PCA compression: 144D → 1D


What we have learned today
• Introduction to face recognition
• Principal Component Analysis (PCA)
• Image compression
