CAP 6412 Advanced Computer Vision
http://www.cs.ucf.edu/~bgong/CAP6412.html
Boqing Gong, Jan 26, 2016
Today
• Administrivia
• A bigger picture and some common questions
• Object detection proposals, by Samer
Past due (12 pm today)
• Assignment 2: Review the following paper
{Major} [Detection Proposals] J. Hosang, R. Benenson, P. Dollár, and B. Schiele. What makes for effective detection proposals? PAMI 2015.
Template for paper review: http://www.cs.ucf.edu/~bgong/CAP6412/Review.docx
An assignment with no due dates
• See “Paper Presentation” on UCF webcourse
• Sharing your slides
• Refer to the original sources of images, figures, etc. in your slides
• Convert them to a PDF file
• Upload the PDF file to “Paper Presentation” after your presentation
Schedule update
Week 2: CNN visualization & object recognition
Week 3: CNN & object localization
Week 4: CNN & transfer learning
Week 5: CNN & segmentation, super-resolution
Week 6: CNN & videos (optical flow, pose)
Week 7: Image captioning & attention model
Week 8: Visual question answering
Week 9: Attention model, aligning books with movies
Week 10-16: Video (tracking, action, surveillance), human-centered CV, 3D CV, low-level CV, etc.
Next week: Image captioning & attention model
Tuesday (02/02)
Harish RaviPrakash
Karpathy, Andrej, and Li Fei-Fei. “Deep visual-semantic alignments for generating image descriptions.” arXiv preprint arXiv:1412.2306 (2014).
& Secondary papers
Thursday (02/04)
Karan Daei-Mojdehi
Xu, Kelvin, Jimmy Ba, Ryan Kiros, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, and Yoshua Bengio. “Show, attend and tell: Neural image caption generation with visual attention.” arXiv preprint arXiv:1502.03044 (2015).
& Secondary papers
Beginning next class
• Make good presentations (#3 course objective)
- Title, authors (full name), authors’ institutes, your name and email
- Motivation of the research (1-2 slides)
- Problem statement (1-2 slides)
- Main contributions of the paper
- Approach outline (1 slide)
- Details of the proposed approach
- Experiments
- Related work (1-3 slides)
- Conclusion: take-home message (1-2 slides)
- Strengths & weaknesses of the paper (1-2 slides)
- Overall rating & why (how you weigh the strengths and weaknesses) (1 slide)
- Future directions (1-3 slides)
40 mins only
Leave me time to cover:
• Under-exploited points in slides/discussion
• Technique details
• More related work and reading references
• My own comments
Why we read these papers: A personalized and biased perspective
Time | Event | Related Papers | Read?
01/2012 | Negative CVPR reviews | [LeNet] Yann LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, November 1998. | Yes
10/2012 | AlexNet wins ILSVRC 2012 | [AlexNet] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “ImageNet classification with deep convolutional neural networks.” In NIPS, 2012. | Yes
11/2013 | Visualize & understand CNNs | [Visualization] Zeiler, Matthew D., and Rob Fergus. “Visualizing and understanding convolutional networks.” In ECCV, 2014. | Yes
2014 | CNN wins on object detection | Girshick, Ross, Jeff Donahue, Trevor Darrell, and Jitendra Malik. “Rich feature hierarchies for accurate object detection and semantic segmentation.” In CVPR, 2014. | This Thursday
Basic network structures: where is CNN?
• Feed-forward networks
• Recurrent neural networks
Image credit: http://mesin-belajar.blogspot.com/2016/01/a-brief-history-of-neural-nets-and-deep_84.html
CNN: a special form of feed-forward networks
• See whiteboard
Detour: Weight sharing in CNN
Convolution layer
Neurons of the same feature map share the same weights (the filter)
Significantly reduced # parameters
Image credit: deeplearning.net/tutorial/lenet.html
Detour: Sparse connection in CNN
The LeNet [LeCun et al., 1998]
Sparse connections vs. full connection
Smaller # parameters, better learning efficiency
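To make the parameter saving concrete, here is a small worked count. It is only an illustration: the layer sizes (a 32x32 single-channel input, six 5x5 filters, no padding) are assumed for the example and are not taken from the slides.

```python
def conv_params(in_channels, out_channels, kernel):
    """Weights + biases of a conv layer: one shared filter per output feature map."""
    return out_channels * (in_channels * kernel * kernel + 1)


def dense_params(n_in, n_out):
    """Weights + biases of a fully connected layer between the same activations."""
    return n_in * n_out + n_out


# Assumed example sizes (hypothetical, for illustration only).
in_h = in_w = 32
in_c, out_c, k = 1, 6, 5
out_h, out_w = in_h - k + 1, in_w - k + 1  # 28 x 28 with no padding

shared = conv_params(in_c, out_c, k)
full = dense_params(in_h * in_w * in_c, out_h * out_w * out_c)
print(shared, full)  # 156 4821600
```

With weight sharing and local (sparse) connections the layer needs 156 parameters; connecting every input pixel to every output neuron would need about 4.8 million.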
What makes for effective detection proposals?
Jan Hosang¹, Rodrigo Benenson¹, Piotr Dollár², and Bernt Schiele¹
¹ Max Planck Institute for Informatics
² Facebook AI Research (FAIR)
Presented by: Samer Iskander
Motivation
• High-performing object detectors rely on object proposals to avoid an exhaustive sliding-window search across the image.
• An in-depth analysis of the different proposal methods is therefore needed to study their impact on detection performance.
Problem Statement
• Despite the widespread use of detection proposals, the trade-offs between performance metrics when employing them still need to be studied.
Main Contributions
• A systematic overview of detection proposal methods is provided.
• The notion of proposal repeatability is introduced.
• The object recall metric is studied on different datasets.
• The influence of different proposal methods on selected object detection algorithms (DPM, R-CNN, and Fast R-CNN) is evaluated.
• A novel metric, average recall (AR), is proposed. It rewards both proposal recall and localization accuracy, and it correlates with detection performance.
Approach Outline
1. Detection Proposal Methods
1.1 Baseline Proposal Method
2. Evaluation Metrics for Object Proposals
3. Proposal Repeatability
4. Proposal Recall
5. Using The Detection Proposals
5.1 Detector Responses Around Objects
5.2 LM-LLDA, R-CNN and Fast R-CNN detection performance
5.3 Predicting detection performance
1. Detection Proposal Methods
Details of The Proposed Approach
Detection Proposal Methods
Grouping Proposal Methods
• They attempt to generate segments (possibly overlapping) that are likely to correspond to objects.
Window Scoring Methods
• They score each candidate window according to how likely it is to contain an object.
• They are faster than grouping methods.
• Unless windows are sampled densely, localization accuracy is low.
1.1 Baseline Proposal Method
A. Uniform: Proposals are generated by uniformly sampling the bounding box center position (x, y), square root area, and log aspect ratio.
The PASCAL VOC 2007 training set is used to estimate these parameters.
B. Gaussian: Proposals are generated by sampling the bounding box center position (x, y), square root area, and log aspect ratio from a multivariate Gaussian distribution.
C. Sliding Window: Windows are generated at equally spaced positions. BING (Binarized Normed Gradients for Objectness Estimation at 300fps) uses 29 specific window sizes; this baseline spreads those sizes homogeneously inside the image.
D. Superpixels: Superpixels are generated with Efficient Graph-Based Image Segmentation.
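The Uniform and Gaussian baselines can be sketched as below. This is a minimal sketch, not the authors' code: the function names are hypothetical, boxes are assumed to be (x1, y1, x2, y2) in pixels, and the sampling ranges, mean, and covariance in the usage lines are made-up placeholders rather than the values estimated from PASCAL VOC 2007.

```python
import numpy as np


def params_to_boxes(cx, cy, sqrt_area, log_aspect, width, height):
    """Turn (center, sqrt area, log aspect ratio) into clipped (x1, y1, x2, y2) boxes."""
    aspect = np.exp(log_aspect)        # aspect = w / h
    w = sqrt_area * np.sqrt(aspect)    # chosen so that w * h == sqrt_area ** 2
    h = sqrt_area / np.sqrt(aspect)
    boxes = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, width)
    boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, height)
    return boxes


def uniform_proposals(n, width, height, low, high, rng):
    """Sample each of the four parameters uniformly within [low, high]."""
    p = rng.uniform(low, high, size=(n, 4))
    return params_to_boxes(p[:, 0], p[:, 1], p[:, 2], p[:, 3], width, height)


def gaussian_proposals(n, width, height, mean, cov, rng):
    """Sample the four parameters jointly from a multivariate Gaussian."""
    p = rng.multivariate_normal(mean, cov, size=n)
    p[:, 2] = np.abs(p[:, 2])  # assumption: force a non-negative sqrt area
    return params_to_boxes(p[:, 0], p[:, 1], p[:, 2], p[:, 3], width, height)


# Usage with placeholder parameters (not estimated from any dataset).
rng = np.random.default_rng(0)
uni = uniform_proposals(1000, 500, 375, [0, 0, 10, -1], [500, 375, 300, 1], rng)
gau = gaussian_proposals(1000, 500, 375, [250, 187, 120, 0],
                         np.diag([100.0 ** 2, 80.0 ** 2, 50.0 ** 2, 0.5 ** 2]), rng)
```

Sampling the square root of the area and the log of the aspect ratio keeps both quantities on comparable, roughly symmetric scales.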
2. Evaluation Metrics for Object Proposals
1. Intersection Over Union (IoU):
• The metrics used to evaluate object proposals are typically all functions of the intersection over union (IoU) between generated proposals and ground-truth annotations.
• For two boxes/regions b_i and b_j, IoU is defined as:
\[
\mathrm{IoU}(b_i, b_j) = \frac{\mathrm{area}(b_i \cap b_j)}{\mathrm{area}(b_i \cup b_j)}
\]
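For axis-aligned boxes the definition reduces to a few lines. The sketch below assumes boxes are given as (x1, y1, x2, y2) with x2 > x1 and y2 > y1; that box format is an assumption, since the slides do not fix one.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```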
2. Recall @ IoU Threshold t:
• For each ground-truth instance, check whether the best proposal from list L has IoU > t.
• If so, this ground-truth instance is considered detected, or recalled.
• Recall is then averaged over all ground-truth instances.
\[
\mathrm{recall}@t = \frac{1}{|G|} \sum_{g_i \in G} I\!\left[\max_{l_j \in L} \mathrm{IoU}(g_i, l_j) > t\right]
\]
I[.] is an indicator function for the logical proposition in its argument.
• Object proposals are evaluated using this metric in two ways:
1. Plotting recall vs. t while fixing the number of proposals in L.
2. Plotting recall vs. the number of proposals while fixing t.
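A minimal NumPy sketch of recall at a threshold t follows. It assumes boxes are (x1, y1, x2, y2) arrays with positive area and a non-empty proposal list; the helper names are hypothetical.

```python
import numpy as np


def pairwise_iou(a, b):
    """IoU between every box in a (A x 4) and every box in b (B x 4); returns A x B."""
    a = np.asarray(a, dtype=float)[:, None, :]
    b = np.asarray(b, dtype=float)[None, :, :]
    iw = np.clip(np.minimum(a[..., 2], b[..., 2]) - np.maximum(a[..., 0], b[..., 0]), 0, None)
    ih = np.clip(np.minimum(a[..., 3], b[..., 3]) - np.maximum(a[..., 1], b[..., 1]), 0, None)
    inter = iw * ih
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area_a + area_b - inter)


def recall_at(ground_truth, proposals, t):
    """Fraction of ground-truth boxes whose best proposal has IoU > t."""
    best = pairwise_iou(ground_truth, proposals).max(axis=1)
    return float((best > t).mean())


gt = [[0, 0, 10, 10], [20, 20, 30, 30]]
props = [[0, 0, 10, 10], [21, 21, 31, 31], [100, 100, 110, 110]]
print(recall_at(gt, props, 0.5), recall_at(gt, props, 0.7))
```

In this toy example the second ground-truth box is matched with IoU 81/119 (about 0.68), so it counts as recalled at t = 0.5 but not at t = 0.7.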
3. Average Best Overlap (ABO): This metric eliminates the need for a threshold. Calculate the overlap between each ground-truth annotation g_i ∈ G and the best object hypothesis in L.
\[
\mathrm{ABO} = \frac{1}{|G|} \sum_{g_i \in G} \max_{l_j \in L} \mathrm{IoU}(g_i, l_j)
\]
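4. Average Recall (AR):
\[
\mathrm{AR} = \frac{2}{|G|} \sum_{g_i \in G} \max\!\left(\max_{l_j \in L} \mathrm{IoU}(g_i, l_j) - 0.5,\; 0\right)
\]
Average recall (for IoU between 0.5 and 1) is plotted against the number of proposals.
5. Volume Under Surface (VUS): It plots recall as a function of both t and the number of proposals, and computes the volume under the surface.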
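ABO and AR can both be computed from the best IoU per ground-truth box. The sketch below assumes those best overlaps are already available as an array, and it assumes AR is defined as twice the mean of max(best IoU - 0.5, 0), which is the same as averaging recall over IoU thresholds from 0.5 to 1. The third function checks that equivalence numerically; all function names are hypothetical.

```python
import numpy as np


def average_best_overlap(best_iou):
    """ABO: mean of the best IoU achieved for each ground-truth box."""
    return float(np.mean(best_iou))


def average_recall(best_iou):
    """AR in closed form: 2 * mean(max(best IoU - 0.5, 0))."""
    best = np.asarray(best_iou, dtype=float)
    return float(2.0 * np.mean(np.maximum(best - 0.5, 0.0)))


def average_recall_by_integration(best_iou, steps=10001):
    """AR as 2 * area under the recall-vs-threshold curve for t in [0.5, 1]."""
    best = np.asarray(best_iou, dtype=float)
    ts = np.linspace(0.5, 1.0, steps)
    recall = (best[None, :] > ts[:, None]).mean(axis=1)
    area = np.sum((recall[1:] + recall[:-1]) / 2.0 * np.diff(ts))  # trapezoid rule
    return float(2.0 * area)


best = [1.0, 0.8, 0.3]  # toy best overlaps for three ground-truth boxes
print(average_best_overlap(best), average_recall(best), average_recall_by_integration(best))
```

Unlike ABO, AR gives no credit to overlaps below 0.5, so a ground-truth box whose best proposal is poorly localized contributes nothing.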
3. Proposal Repeatability
1. For each image in the PASCAL VOC 2007 test set, several perturbed versions are generated (blur, rotation, scale, illumination, JPEG compression, and “salt and pepper” noise).
2. For each pair of reference and perturbed images, detection proposals are computed with a given method (generating 1000 windows per image).
3. The proposals are projected back from the perturbed image into the reference image and then matched to the proposals in the reference image.
4. Recall is then plotted against the IoU threshold t (from 0 to 1), and repeatability is the area under the curve.
5. Methods that propose windows at similar locations at high IoU (and thus on similar image content) are more repeatable, since the area under the curve is larger.
6. Large windows are more likely to match than smaller ones, since the same perturbation has a larger relative effect on smaller windows.
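Steps 3 and 4 can be sketched as follows. This is a simplified sketch under stated assumptions: the perturbed image's proposals are assumed to be already projected back into the reference frame, each reference proposal is matched independently to its best-overlapping back-projected proposal (no one-to-one constraint), and the function names are hypothetical.

```python
import numpy as np


def pairwise_iou(a, b):
    """IoU between every box in a (A x 4) and every box in b (B x 4); returns A x B."""
    a = np.asarray(a, dtype=float)[:, None, :]
    b = np.asarray(b, dtype=float)[None, :, :]
    iw = np.clip(np.minimum(a[..., 2], b[..., 2]) - np.maximum(a[..., 0], b[..., 0]), 0, None)
    ih = np.clip(np.minimum(a[..., 3], b[..., 3]) - np.maximum(a[..., 1], b[..., 1]), 0, None)
    inter = iw * ih
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area_a + area_b - inter)


def repeatability(reference_props, back_projected_props, steps=101):
    """Recall-vs-IoU curve between the two proposal sets and the area under it."""
    best = pairwise_iou(reference_props, back_projected_props).max(axis=1)
    ts = np.linspace(0.0, 1.0, steps)
    recall = (best[None, :] >= ts[:, None]).mean(axis=1)
    auc = float(np.sum((recall[1:] + recall[:-1]) / 2.0 * np.diff(ts)))  # trapezoid rule
    return ts, recall, auc


ref = [[0, 0, 10, 10], [5, 5, 20, 25]]
_, _, auc_same = repeatability(ref, ref)                         # unchanged proposals
_, _, auc_far = repeatability(ref, [[100, 100, 110, 110]])       # nothing overlaps
print(auc_same, auc_far)
```

A method that returns the same windows on the perturbed image reaches an area of 1, while one whose windows no longer overlap the reference windows scores close to 0.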
• Scale: All methods except BING show a drastic drop with small scale changes, but suffer only minor degradation for larger changes. BING is more robust to small scale changes; however, it is more sensitive to larger changes because it uses a coarse set of box sizes while searching for candidates.
• JPEG compression: Small compression has a large effect, and more aggressive compression shows monotonic degradation. Despite using gradient information, BING is the most robust to this kind of change.
• Rotation: All proposal methods are affected by image rotation. The repeatability loss is due to matching rotated bounding boxes.
• Illumination: Methods based on superpixels are heavily affected. BING is more robust, likely due to its use of gradient information, which is known to be fairly robust to illumination changes.
• Blur: The repeatability results again exhibit a similar trend, although the drop is stronger (in comparison with other effects) for small amounts of blur.
• Salt and pepper noise: Significant degradation in repeatability occurs for the majority of the methods when merely ten pixels are modified.
4. Proposal Recall
• If repeatability is a concern, the proposal method should be selected with care.
• For object detection, another aspect of interest is recall.
Dataset | Description
1. PASCAL | Includes 20 object categories in nearly 5,000 unconstrained images.
2. ImageNet | The larger ImageNet 2013 set has 200 categories in over 20,000 images. It includes types of objects that are not in PASCAL. ImageNet and PASCAL have similar numbers of objects per image and similar object sizes.
3. MS COCO | Microsoft Common Objects in Context (MS COCO) has more objects per image and smaller objects, but fewer object classes than ImageNet (80 object categories).
Overall, the methods fall into two groups:
1. Well-localized methods that gradually lose recall as the IoU threshold increases.
2. Methods that only provide coarse bounding box locations, so their recall drops rapidly.
5. Using The Detection Proposals
• This is an analysis of detection proposals as used with object detection.
• The 2 main goals:
1. Measuring the performance of proposal methods for object detection.
2. Measuring the effect of object proposal metrics on final detection performance.
5.1 Detector Responses Around Objects
• It is necessary to examine how well-localized proposals (high IoU) relate to object detection (recall), and how much that localization matters.
5.2 LM-LLDA, R-CNN and Fast R-CNN detection performance
1. Apply LM-LLDA models to generate dense detections using the standard sliding window.
2. Apply different object proposals to filter these detections at test time.
* These steps are used to evaluate the effect of proposals on detection quality.
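One plausible way to implement the filtering in step 2 is sketched below. The slides do not spell out the filtering rule, so this sketch assumes that a dense detection survives if it overlaps at least one proposal with IoU of at least `min_iou`; both the rule and the default threshold are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np


def pairwise_iou(a, b):
    """IoU between every box in a (A x 4) and every box in b (B x 4); returns A x B."""
    a = np.asarray(a, dtype=float)[:, None, :]
    b = np.asarray(b, dtype=float)[None, :, :]
    iw = np.clip(np.minimum(a[..., 2], b[..., 2]) - np.maximum(a[..., 0], b[..., 0]), 0, None)
    ih = np.clip(np.minimum(a[..., 3], b[..., 3]) - np.maximum(a[..., 1], b[..., 1]), 0, None)
    inter = iw * ih
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area_a + area_b - inter)


def filter_detections(det_boxes, det_scores, proposals, min_iou=0.8):
    """Keep dense detections that overlap some proposal by at least min_iou (assumed rule)."""
    det_boxes = np.asarray(det_boxes, dtype=float)
    det_scores = np.asarray(det_scores, dtype=float)
    keep = pairwise_iou(det_boxes, proposals).max(axis=1) >= min_iou
    return det_boxes[keep], det_scores[keep]


dets = [[0, 0, 10, 10], [50, 50, 60, 60]]
scores = [0.9, 0.8]
props = [[0, 0, 10, 11]]  # overlaps the first detection with IoU 100/110
kept_boxes, kept_scores = filter_detections(dets, scores, props)
print(kept_boxes, kept_scores)
```

Because the detector itself is unchanged, any drop in detection quality after filtering can be attributed to the proposals.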
• Using only 1000 proposals reduces detection quality.
• However, methods with high average recall (AR) also have high mean average precision (mAP), and vice versa.
• From the per-class results table, proposals: (1) clearly hurt performance (bicycle, boat, bottle, car, chair, horse, mbike, person), reducing recall and precision because of bad localization; (2) improve performance (cat, table, dog); (3) show no significant change (all remaining classes).
• Fast R-CNN after re-training for each method.
• In the rightmost column, Fast R-CNN trained with 1000 Selective Search proposals and applied at test time with a given proposal method is compared against Fast R-CNN trained for the test-time proposal method.
5.3 Predicting detection performance
Related Work:
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren¹, Kaiming He², Ross Girshick³, and Jian Sun²
¹ University of Science and Technology of China
² Microsoft Research
³ Facebook AI Research
• This object detection system is composed of two modules. The first module is a deep fully convolutional network that proposes regions, and the second module is the Fast R-CNN detector that uses the proposed regions.
• The RPN module tells the Fast R-CNN module where to look.
• A Region Proposal Network (RPN) takes an image (of any size) as input and outputs a set of rectangular object proposals, each with an objectness score.
• To generate region proposals, a small network is slid over the convolutional feature map output by the last shared convolutional layer.
• This small network takes as input an n x n spatial window of the input convolutional feature map.
• Each sliding window is mapped to a lower-dimensional feature (256-d for ZF and 512-d for VGG, with ReLU following).
• This feature is fed into two sibling fully connected layers: a box-regression layer (reg) and a box-classification layer (cls).
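The sliding-window head described above can be sketched in NumPy. This is a shape-level sketch with random weights, not the trained network: biases are omitted, the feature map is zero-padded so that every location gets a window, and `k` (the number of reference boxes per location, giving 2k scores and 4k coordinates as in the Faster R-CNN paper) does not appear in the slides. All sizes in the usage lines are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rpn_head(feature_map, w_mid, w_cls, w_reg, n=3):
    """Apply an n x n sliding window plus two sibling layers to a (C, H, W) feature map."""
    c, h, w = feature_map.shape
    pad = n // 2  # assumes odd n so that the output keeps the H x W layout
    padded = np.pad(feature_map, ((0, 0), (pad, pad), (pad, pad)))
    # windows[c, y, x] is the n x n patch of channel c centered at (y, x).
    windows = sliding_window_view(padded, (n, n), axis=(1, 2))   # (C, H, W, n, n)
    flat = windows.transpose(1, 2, 0, 3, 4).reshape(h, w, c * n * n)
    mid = np.maximum(flat @ w_mid, 0.0)   # lower-dimensional feature followed by ReLU
    cls = mid @ w_cls                     # (H, W, 2k) objectness scores
    reg = mid @ w_reg                     # (H, W, 4k) box regression outputs
    return cls, reg


# Illustrative sizes only: 8 input channels, a 16-d intermediate feature, k = 9.
rng = np.random.default_rng(0)
C, H, W, n, D, k = 8, 5, 7, 3, 16, 9
fmap = rng.standard_normal((C, H, W))
cls, reg = rpn_head(fmap,
                    rng.standard_normal((C * n * n, D)),
                    rng.standard_normal((D, 2 * k)),
                    rng.standard_normal((D, 4 * k)),
                    n=n)
print(cls.shape, reg.shape)  # (5, 7, 18) (5, 7, 36)
```

Because the same weights are applied at every window position, the two sibling layers behave like 1 x 1 convolutions on top of an n x n convolution.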
Conclusion
• This paper revisits the majority of existing detection proposal methods, proposes new evaluation metrics, and performs an extensive and direct comparison of existing methods.
• The repeatability of all proposal methods is limited: small changes to an image cause a noticeable change in the set of produced proposals.
• For object detection, improving proposal localization accuracy (improved IoU) is as important as improving recall.
• To simultaneously measure both proposal recall and localization accuracy, average recall (AR) summarizes the distribution of recall across a range of overlap thresholds.
Strengths
• This paper provides a new metric, Average Recall (AR), that relates accuracy (recall) to good localization (IoU).
• It demonstrates different evaluation protocols for comparing proposal methods (repeatability, recall, and using proposal methods for object detection).
Weaknesses
• This paper covers only 12 proposal methods, because their implementations are available.
• The baseline proposal methods (uniform, Gaussian, sliding window, and superpixels) are not algorithms.
Overall Rating
• My Rating Scale (0-5): 1
The new performance metric, Average Recall (AR), is just Average Best Overlap (ABO) restricted to the IoU range 0.5 to 1.
The comparison covers only 12 proposal methods.