CAP5415 Computer Vision
Lecture 13: Support Vector Machines for Computer Vision Applications
Guest Lecturer: Dr. Boqing Gong
10/6/15
Reminders
• October 14 – Choose your mini-projects (both). Send an email with a short proposal/explanation.
• October 8 – Due date for Programming Assignment #3
Pattern Classification Problem
• Suppose we are given two classes of objects; we are then faced with a new object and have to assign it to one of the two classes.
Motivation
(Figures: a two-class scatter plot, where + denotes +1 and − denotes −1, shown repeatedly with several different candidate separating lines.)
• How would you classify this data?
• Any of these would be fine... but which is best?
Classifier Design
• Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a data point.
Maximum Margin
• The maximum margin linear classifier is the linear classifier with the maximum margin. This is the simplest kind of SVM (called a linear SVM, or LSVM).
• Support vectors are those data points that the margin pushes up against.

Why maximize the margin?
1. Intuitively this feels safest.
2. If we've made a small error in the location of the boundary (it's been jolted in its perpendicular direction), this gives us the least chance of causing a misclassification.
3. LOOCV is easy, since the model is immune to removal of any non-support-vector data points.
4. There's some theory (using VC dimension) that is related to (but not the same as) the proposition that this is a good thing.
5. Empirically it works very, very well.
SVM
• A supervised approach for classification and regression.
– Developed in the computer science community in the 1990s; has grown in popularity since then.
– Shown to perform well in a variety of settings, and often considered one of the best "out-of-the-box" classifiers.
Max-Margin Classifier → Support Vector Classifier → Support Vector Machines
(historical appearances in the literature)
Maximal-Margin Classifier
• In a p-dimensional space, a hyperplane is a flat affine subspace of dimension p − 1.
– Ex. In 2D, a hyperplane is a 1D line.
– Ex. In 3D, a hyperplane is a 2D plane.
• General hyperplane definition: β0 + β1X1 + β2X2 + ... + βpXp = 0
• Hyperplane for 2D data: β0 + β1X1 + β2X2 = 0
• Example: the hyperplane 1 + 2X1 + 3X2 = 0 divides 2D space into the points where 1 + 2X1 + 3X2 > 0 and the points where 1 + 2X1 + 3X2 < 0; the sign of this expression classifies a point.
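The sign rule above can be checked with a few lines of code (a minimal sketch; the coefficient values come from the slide's example hyperplane 1 + 2X1 + 3X2 = 0):

```python
import numpy as np

# Hyperplane 1 + 2*X1 + 3*X2 = 0 from the slide's example.
beta0, beta = 1.0, np.array([2.0, 3.0])

def side_of_hyperplane(x):
    """Return +1 if beta0 + beta.x > 0, -1 if < 0, and 0 if exactly on the plane."""
    return int(np.sign(beta0 + beta @ np.asarray(x, dtype=float)))

print(side_of_hyperplane([1.0, 1.0]))    # 1 + 2 + 3 = 6 > 0, so +1
print(side_of_hyperplane([-2.0, 1.0]))   # 1 - 4 + 3 = 0, exactly on the plane
print(side_of_hyperplane([-1.0, -1.0]))  # 1 - 2 - 3 = -4 < 0, so -1
```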
Max-Margin Classifier
Left: There are two classes of observations, blue and purple (each of which has measurements on two variables). Three separating hyperplanes, out of many possible, are shown in black. Right: A separating hyperplane is shown in black: a test observation that falls in the blue portion of the grid will be assigned to the blue class, and a test observation that falls into the purple portion of the grid will be assigned to the purple class.

There are two classes of observations, shown in blue and in purple. The maximal margin hyperplane is shown as a solid line. The margin is the distance from the solid line to either of the dashed lines. The two blue points and the purple point that lie on the dashed lines are the support vectors, and the distance from those points to the margin is indicated by arrows. The purple and blue grid indicates the decision rule made by a classifier based on this separating hyperplane.
Construction of the Max-Margin Classifier
• Consider n training observations x1, ..., xn ∈ R^p
• and associated class labels y1, ..., yn ∈ {−1, +1}.
• Briefly, the max-margin hyperplane is the solution to the optimization problem:

  maximize M over β0, β1, ..., βp
  subject to Σj βj² = 1 and
  yi(β0 + β1 xi1 + ... + βp xip) ≥ M for all i = 1, ..., n.

• This constraint ensures that each observation is on the correct side of the hyperplane and at least a distance M from the hyperplane. Hence, M represents the margin of our hyperplane.
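The margin constraint can be verified on a fitted classifier. A sketch, assuming scikit-learn (the lecture prescribes no library); a very large C approximates the hard-margin problem, and in sklearn's scaling the support vectors have functional margin 1 while the geometric margin is 1/‖w‖:

```python
import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable set; a huge C approximates the hard-margin problem.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([-1, -1, -1, +1, +1, +1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Every observation satisfies y_i (w . x_i + b) >= 1, with equality (up to
# solver tolerance) for the support vectors sitting on the margin boundary.
functional_margins = y * (X @ w + b)
margin_width = 1.0 / np.linalg.norm(w)  # geometric margin M

print(functional_margins.min(), margin_width)
```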
Non-Separable Case
The maximal margin classifier is a very natural way to perform classification if a separating hyperplane exists. In many cases, however, no separating hyperplane exists, and we cannot exactly separate the two classes. (Notice the soft margin in the following slides!) The generalization of the maximal margin classifier to the non-separable case is known as the support vector classifier.
Support Vector Classifier – Separable Case
• The separable case is shown.
• The decision boundary is the solid line.
• Broken lines bound the shaded maximal margin of width 2M.
Support Vector Classifier – Overlap Case
• The overlapping (non-separable) case is shown.
• The points labeled ξi are on the wrong side of the margin.
• The margin is maximized subject to a total budget: Σi ξi ≤ constant.
Is Max-Margin Robust?
Left: Two classes of observations are shown in blue and in purple, along with the maximal margin hyperplane. Right: An additional blue observation has been added, leading to a dramatic shift in the maximal margin hyperplane (shown as a solid line). The dashed line indicates the maximal margin hyperplane that was obtained in the absence of this additional point. The max-margin classifier is sensitive to individual observations!
Soft Margin ~ Support Vector Classifier
• Rather than seeking the largest possible margin so that every observation is not only on the correct side of the hyperplane but also on the correct side of the margin, we instead allow some observations to be on the incorrect side of the margin, or even the incorrect side of the hyperplane.
• C: a nonnegative tuning parameter; ξi: slack variables allowing observations to be on the wrong side of the margin.
SV Classifier
Left: A support vector classifier was fit to a small data set. The hyperplane is shown as a solid line and the margins are shown as dashed lines. Purple observations: observations 3, 4, 5, and 6 are on the correct side of the margin, observation 2 is on the margin, and observation 1 is on the wrong side of the margin. Blue observations: observations 7 and 10 are on the correct side of the margin, observation 9 is on the margin, and observation 8 is on the wrong side of the margin. No observations are on the wrong side of the hyperplane. Right: Two additional points, 11 and 12, are added.

Four different values of the tuning parameter C were used to fit an SVM to a small data set. The largest value of C was used in the top left panel, and smaller values were used in the top right, bottom left, and bottom right panels. When C is large, there is a high tolerance for observations being on the wrong side of the margin, and so the margin will be large. As C decreases, the tolerance for observations being on the wrong side of the margin decreases, and the margin narrows.
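The effect of C can be seen by counting support vectors. One caveat for this sketch (scikit-learn assumed): sklearn's C is the inverse of the budget-style C described above, so a small sklearn C means more tolerance for margin violations and a wider margin:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping classes. Note: sklearn's C is the *inverse* of the
# budget-style C on the slide -- a SMALL sklearn C tolerates more margin
# violations (wide margin, many support vectors), a LARGE sklearn C
# tolerates fewer (narrow margin, fewer support vectors).
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

n_sv = {}
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    n_sv[C] = clf.support_vectors_.shape[0]

print(n_sv)  # wider margins touch (or are violated by) more points
```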
SV Classifier
In practice we are sometimes faced with non-linear class boundaries. A support vector classifier, or any linear classifier, will perform poorly here. Left: The observations fall into two classes, with a non-linear boundary between them. Right: The support vector classifier seeks a linear boundary, and consequently performs very poorly.
Non-Linear Support Vector Machines
• General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable:
Φ: x → φ(x)
The Kernel Trick
• The linear classifier relies on the dot product between vectors: K(xi, xj) = xiᵀxj.
• If every data point is mapped into a high-dimensional space via some transformation Φ: x → φ(x), the dot product becomes: K(xi, xj) = φ(xi)ᵀφ(xj).
• A kernel function is a function that corresponds to an inner product in some expanded feature space.
• Example: for 2-dimensional vectors x = [x1, x2], let K(xi, xj) = (1 + xiᵀxj)². We need to show that K(xi, xj) = φ(xi)ᵀφ(xj):

  K(xi, xj) = (1 + xiᵀxj)²
  = 1 + xi1²xj1² + 2·xi1xj1·xi2xj2 + xi2²xj2² + 2·xi1xj1 + 2·xi2xj2
  = [1, xi1², √2·xi1xi2, xi2², √2·xi1, √2·xi2]ᵀ [1, xj1², √2·xj1xj2, xj2², √2·xj1, √2·xj2]
  = φ(xi)ᵀφ(xj), where φ(x) = [1, x1², √2·x1x2, x2², √2·x1, √2·x2]
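The algebra above can be verified numerically; a minimal check that the kernel value matches the explicit feature-map inner product:

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map from the slide (2D input -> 6D)."""
    x1, x2 = x
    return np.array([1.0, x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2])

def K(xi, xj):
    """Kernel (1 + xi . xj)^2 -- the same inner product, no explicit mapping."""
    return (1.0 + np.dot(xi, xj)) ** 2

xi = np.array([1.0, 2.0])
xj = np.array([3.0, 4.0])
print(K(xi, xj), phi(xi) @ phi(xj))  # both equal (1 + 11)^2 = 144
```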
Examples of Kernel Functions
• Linear: K(xi, xj) = xiᵀxj
• Polynomial of power p: K(xi, xj) = (1 + xiᵀxj)^p
• Gaussian (radial-basis function network): K(xi, xj) = exp(−‖xi − xj‖² / (2σ²))
• Sigmoid: K(xi, xj) = tanh(β0·xiᵀxj + β1)
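These four kernels are one-liners; a sketch (parameter defaults here are illustrative, not prescribed by the lecture):

```python
import numpy as np

def linear_k(xi, xj):
    return np.dot(xi, xj)

def poly_k(xi, xj, p=3):
    return (1.0 + np.dot(xi, xj)) ** p

def gaussian_k(xi, xj, sigma=1.0):
    # exp(-||xi - xj||^2 / (2 sigma^2)); always in (0, 1], equal to 1 iff xi == xj
    return np.exp(-np.sum((xi - xj) ** 2) / (2.0 * sigma ** 2))

def sigmoid_k(xi, xj, beta0=1.0, beta1=0.0):
    return np.tanh(beta0 * np.dot(xi, xj) + beta1)

x = np.array([1.0, 2.0])
z = np.array([2.0, 0.0])
print(linear_k(x, z), poly_k(x, z, p=2), gaussian_k(x, x), gaussian_k(x, z))
```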
Non-Linear SVM Mathematically
• Dual problem formulation: find α1, ..., αN such that Q(α) = Σαi − ½ΣΣ αiαj yiyj K(xi, xj) is maximized, subject to (1) Σ αiyi = 0 and (2) αi ≥ 0 for all i.
• The solution is: f(x) = Σ αiyi K(xi, x) + b
• Optimization techniques for finding the αi's remain the same!
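The solution form f(x) = Σ αi yi K(xi, x) + b can be checked against a fitted model. In this sketch (scikit-learn assumed), SVC's dual_coef_ stores the products αi·yi for the support vectors, so the decision function can be rebuilt by hand:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X, y = make_moons(n_samples=100, noise=0.2, random_state=0)
gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

# dual_coef_ holds alpha_i * y_i for the support vectors, so
# f(x) = sum_i alpha_i y_i K(x_i, x) + b can be reconstructed directly:
K = rbf_kernel(clf.support_vectors_, X, gamma=gamma)
f_manual = (clf.dual_coef_ @ K).ravel() + clf.intercept_

print(np.allclose(f_manual, clf.decision_function(X)))
```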
Formulating the Optimization Problem
(Figure: two classes in the (Var1, Var2) plane; the hyperplane w·x + b = 0 with margin boundaries w·x + b = +1 and w·x + b = −1; each slack variable ξi measures how far a point violates its margin.)
• The constraint becomes: yi(w·xi + b) ≥ 1 − ξi, with ξi ≥ 0 for all i.
• Objective function: min (½)‖w‖² + C Σi ξi
• The objective function penalizes misclassified instances and those within the margin; C trades off margin width and misclassifications.
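The objective can be evaluated directly for any candidate (w, b), since at the optimum the slack is the hinge loss ξi = max(0, 1 − yi(w·xi + b)). A small sketch with made-up numbers:

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """(1/2)||w||^2 + C * sum_i xi_i, where the optimal slack is
    xi_i = max(0, 1 - y_i (w . x_i + b)) -- the hinge loss."""
    slack = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * np.dot(w, w) + C * slack.sum()

X = np.array([[2.0, 2.0], [-2.0, -2.0], [0.1, 0.1]])
y = np.array([+1, -1, -1])
w, b = np.array([1.0, 1.0]), 0.0

# The first two points clear the margin (zero slack); the third is on the
# wrong side of the margin and contributes slack 1.2, so the total is
# 0.5 * 2 + 1.0 * 1.2 = 2.2.
print(soft_margin_objective(w, b, X, y, C=1.0))
```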
Non-Linear SVM Overview
• An SVM locates a separating hyperplane in the feature space and classifies points in that space.
• It does not need to represent the space explicitly; it simply defines a kernel function.
• The kernel function plays the role of the dot product in the feature space.
Disadvantages of Linear Decision Surfaces / Advantages of Non-Linear Decision Surfaces
(Figures: the same two-class data in the (Var1, Var2) plane, first with a poorly fitting linear boundary, then with a well-fitting non-linear boundary.)
SVM Classifier
Left: An SVM with a polynomial kernel of degree 3 is applied to the non-linear data from the previous slides, resulting in a far more appropriate decision rule. Right: An SVM with a radial kernel is applied. In this example, either kernel is capable of capturing the decision boundary.
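The lecture's figure uses data from ISLR; as a stand-in, concentric circles make the same point (scikit-learn assumed; coef0=1 makes the degree-3 polynomial kernel inhomogeneous, so it contains the quadratic terms needed for a circular boundary):

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Concentric-circle data: no linear boundary can separate the classes.
X, y = make_circles(n_samples=300, noise=0.1, factor=0.3, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

scores = {}
for name, clf in [("linear", SVC(kernel="linear")),
                  ("poly", SVC(kernel="poly", degree=3, coef0=1.0)),
                  ("rbf", SVC(kernel="rbf"))]:
    scores[name] = clf.fit(Xtr, ytr).score(Xte, yte)

print(scores)  # the non-linear kernels should handily beat the linear one
```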
Multi-Class SVM
• SVMs can only handle two-class outputs (i.e., a categorical output variable with arity 2).
• What can be done?
• Answer: with output arity N, learn N SVMs:
– SVM 1 learns "Output == 1" vs. "Output != 1"
– SVM 2 learns "Output == 2" vs. "Output != 2"
– ...
– SVM N learns "Output == N" vs. "Output != N"
• Then, to predict the output for a new input, just predict with each SVM and find out which one puts the prediction the furthest into the positive region.
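The one-vs-rest recipe above can be sketched directly (scikit-learn and the iris data are assumptions for illustration): train N binary SVMs, then take the argmax of the N decision values:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

# One-vs-rest by hand: for each class k, train a binary SVM on
# "class k" vs. "everything else".
X, y = load_iris(return_X_y=True)
classes = np.unique(y)

models = [LinearSVC(C=1.0, max_iter=10000).fit(X, (y == k).astype(int))
          for k in classes]

# Classify by whichever SVM pushes the point furthest into its
# positive region, exactly as described on the slide.
scores = np.column_stack([m.decision_function(X) for m in models])
pred = classes[np.argmax(scores, axis=1)]

accuracy = (pred == y).mean()
print(accuracy)
```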
Trade-off Between Flexibility and Interpretability
In general, as the flexibility of a method increases, its interpretability decreases.
Why do SVMs generalize?
• Even though they map to a very high-dimensional space:
– They have a very strong bias in that space.
– The solution has to be a linear combination of the training instances.
• A large theory on Structural Risk Minimization provides bounds on the error of an SVM.
– Typically these error bounds are too loose to be of practical use.
Practical Issues
• Choice of kernel
– A Gaussian or polynomial kernel is the default; if ineffective, more elaborate kernels are needed.
– Domain experts can give assistance in formulating appropriate similarity measures.
• Choice of kernel parameters
– e.g., σ in the Gaussian kernel; σ is on the order of the distance between the closest points with different classifications.
– In the absence of reliable criteria, applications rely on the use of a validation set or cross-validation to set such parameters.
• Optimization criterion – hard margin vs. soft margin
– Typically a lengthy series of experiments in which various parameters are tested.
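Cross-validating the kernel parameters, as suggested above, might look like this (scikit-learn assumed; note that sklearn parameterizes the Gaussian kernel by gamma = 1/(2σ²) rather than by σ directly):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# 5-fold cross-validation over the Gaussian-kernel width and the
# soft-margin parameter C, in place of "reliable criteria".
X, y = make_moons(n_samples=200, noise=0.25, random_state=0)

param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5).fit(X, y)

print(search.best_params_, search.best_score_)
```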
CV Application of SVM: Human Detection
• Final feature vectors go to the SVM.

CV Application of SVM: Pedestrian Detection
• Feature vectors: HOG (histogram of oriented gradients).
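As a rough sketch of the HOG-plus-linear-SVM pipeline (everything here is illustrative: the descriptor is a stripped-down HOG without block normalization, and the "pedestrian" data is synthetic edge-vs-flat patches, not real images):

```python
import numpy as np
from sklearn.svm import LinearSVC

def hog_like_features(img, n_bins=9, cell=8):
    """Minimal HOG-like descriptor: per-cell histograms of gradient
    orientations, weighted by gradient magnitude. (A sketch only; the
    real Dalal-Triggs HOG adds block normalization and interpolation.)"""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation in [0, pi)
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=n_bins, range=(0, np.pi), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

# Toy "detector": strong-vertical-edge patches vs. flat patches,
# classified by a linear SVM on the HOG-like features.
rng = np.random.default_rng(0)
def edge_patch():
    return np.tile(np.linspace(0, 1, 16), (16, 1)) + 0.05 * rng.normal(size=(16, 16))
def flat_patch():
    return 0.5 + 0.05 * rng.normal(size=(16, 16))

X = np.array([hog_like_features(edge_patch()) for _ in range(40)] +
             [hog_like_features(flat_patch()) for _ in range(40)])
y = np.array([1] * 40 + [0] * 40)
clf = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
print(clf.score(X, y))
```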
References and Slide Credits
• An excellent tutorial on VC-dimension and Support Vector Machines: C. J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121-167, 1998. http://citeseer.nj.nec.com/burges98tutorial.html
• The VC/SRM/SVM Bible: Statistical Learning Theory by Vladimir Vapnik. Wiley-Interscience, 1998.
• Slide credits: Andrew W. Moore (CMU); James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning.