cs231a final report - Stanford University · showing better performance, I wanted to apply general feature detectors and use traditional classifier to 1) automate the training process

CS231A Final Project Report
Whale Face Identification: Can you distinguish these faces?

Seung Woo Jung

Abstract

I have built a North Atlantic right whale detector and classifier that localizes the whale in aerial photos, extracts HoG features, and trains a set of binary SVMs to identify the individual whale in each image. The localization algorithm compares spatial histograms using the Earth Mover's Distance and applies median filtering to capture a bounding box.

Introduction

North Atlantic right whales are one of the most endangered whale species in the world and are currently protected under the U.S. Endangered Species Act. Most of the remaining 400 or so individuals live in the North Atlantic Ocean, where scientists use aerial photos taken from helicopters to track each individual's migration status. Scientists at the National Oceanic and Atmospheric Administration (NOAA) are responsible for manually labeling each photo as a specific individual, but the work is error-prone and labor-intensive. The objective of this project is to develop a framework in which each whale's anatomical features can be extracted and identification can be treated as an automated classification problem. The project is directly inspired by a Kaggle competition.

Consideration of Images

The anatomical features of a right whale that are visible in an aerial photo are limited, but distinctive parasitic growths on the dorsal side of each whale differentiate one individual from another. Since there is no noticeable difference between the color of the whale's body and the water, callosity shape is the only differentiating factor in a whale "face." Varying illumination also changes the apparent color of the water and the whale, which hinders localization. Another limiting factor is the difficulty of rectification, because there is no information about the distance between the camera and the object. Whale size cannot be assumed to be constant either, because the photos were taken over the last 10 years.

Figure 1: Individual whales have different shapes and sizes of callosities on the center and/or sides of the head, and the patterns may be continuous or broken.

Previous Work

Previous work by deepsense.io uses CNNs for localization, alignment, and classification of the whales, training on head features to bound a whale's head. Deepsense.io used CNNs of differing depth for each task and manually trained each CNN with hand-labeled local features of the whale, such as the blowhead and bonnet-tip. While a CNN captures different layers of features from the object, and is therefore superior at preventing overfitting and shows better performance, I wanted to apply general feature detectors and a traditional classifier to 1) automate the training process and 2) save computation time.

Technical Methodology

The overall setup is a framework in which successive layers 1) localize the whale head, 2) align the head to a specific angle, and 3) extract features from the cropped and rotated image to form a feature vector that can be supplied to an SVM for classification. There were 457 individuals, each forming one SVM class, with 4,700 images in the training set and 11,400 images in the test set. The training set did not contain an even number of images per individual, so it was difficult to generalize the rectification options.

Figure 2: (Left) Overall workflow of image classification. (Right) The training set contained uneven occurrences of whale photos.

1) Test of Feature Extraction

Figure 3: (From left to right) FAST, SURF, Harris, and HoG feature detection.

Before moving to localization of the whale itself, I experimented with descriptor algorithms that specialize in different image features: the FAST and Harris detectors for corners, SURF as a robust blob detector, and HoG for gradient detection, which requires alignment. The FAST and Harris detectors produced a feature space too sparse for the histogram generation required for classification, so I kept SURF and HoG as the two candidates for feature detection.
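To illustrate why a gradient-orientation descriptor such as HoG depends on alignment, here is a minimal sketch of the orientation histogram for a single cell. It is not the project's code: the function name `cell_orientation_histogram`, the 9-bin unsigned layout, and the central-difference gradient are assumptions chosen for illustration, and a full HoG descriptor would add block normalization on top of this.

```python
import math

def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a list of equal-length rows of grayscale intensities.
    Border pixels are skipped because central differences need neighbours.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist

# A vertical edge and the same edge rotated by 90 degrees (hypothetical data).
vertical_edge = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
rotated_edge = [list(row) for row in zip(*vertical_edge)]
print(cell_orientation_histogram(vertical_edge))
print(cell_orientation_histogram(rotated_edge))
```

The two printed histograms carry the same total energy but in different bins, so the same structure photographed at a different heading yields a different descriptor unless the image is rotated to a common angle first.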

2) Localization

Early attempts at localization involved clustering descriptors based on their location and intensity, but I soon realized that simple k-means or mean-shift clustering did not capture the descriptors well. The reason was that there were too many waves, so features also clustered in non-head areas. The algorithm for localizing the whale is as follows:

1) Normalize each color channel, and compute the median over the entire matrix for each color channel.
2) Conduct histogram equalization for added contrast.
3) Also construct a histogram for each color channel matrix.
4) Divide the image into subgrids (50x50 pixels) and compute a histogram for each color channel.
5) Compute the Earth Mover's Distance between the entire-image histogram and the subgrid histogram for each color channel, and add the distances.
   - Distance matrix D[x,y] = sum of emd_distances
   - size(D) = [size(im,1)/subgrid_size, size(im,2)/subgrid_size]
6) Interpolate a function based on the distance matrix and output an EMD distance image array.
7) Add both images (the contrast-enhanced image and the EMD distance image) and normalize to acquire an image heatmap that can be used to extract the bounding box.
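Steps 4 and 5 of the list above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the project's implementation: the image is a nested list of (R, G, B) tuples, the bin count `n_bins=16` and the function names are hypothetical, and the Earth Mover's Distance is computed with the closed form for one-dimensional normalized histograms (the sum of absolute differences of the cumulative histograms) instead of the Pele and Werman algorithm cited in the references.

```python
def normalized_histogram(values, n_bins=16, max_value=256):
    """Histogram of intensities in [0, max_value), scaled to sum to 1."""
    hist = [0.0] * n_bins
    for v in values:
        hist[min(int(v) * n_bins // max_value, n_bins - 1)] += 1.0
    total = sum(hist)
    return [count / total for count in hist] if total else hist

def emd_1d(p, q):
    """EMD between two normalized 1-D histograms with unit bin spacing."""
    carried, distance = 0.0, 0.0
    for a, b in zip(p, q):
        carried += a - b          # mass that must move past this bin
        distance += abs(carried)
    return distance

def emd_distance_matrix(image, subgrid_size=50, n_bins=16):
    """D[row][col] = sum over channels of EMD(whole-image hist, subgrid hist)."""
    height, width = len(image), len(image[0])
    n_channels = len(image[0][0])
    whole = [normalized_histogram([px[c] for row in image for px in row], n_bins)
             for c in range(n_channels)]
    rows, cols = height // subgrid_size, width // subgrid_size
    D = [[0.0] * cols for _ in range(rows)]
    for gy in range(rows):
        for gx in range(cols):
            block = [image[y][x]
                     for y in range(gy * subgrid_size, (gy + 1) * subgrid_size)
                     for x in range(gx * subgrid_size, (gx + 1) * subgrid_size)]
            for c in range(n_channels):
                sub = normalized_histogram([px[c] for px in block], n_bins)
                D[gy][gx] += emd_1d(whole[c], sub)
    return D
```

Because most of the image is assumed to be water, the whole-image histogram looks like a water histogram, and subgrids that contain something else receive a larger summed distance.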

Figure 4: Various parts of the localization algorithm. (Top-left) Original image. (Top-right) After median subtraction and histogram equalization to enhance contrast. (Bottom-left) Subgrid vs. whole-image histogram distance map, based on the Earth Mover's Distance. (Bottom-right) Both images added and normalized.
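The final step, adding the two maps, normalizing, and reading off a bounding box, might look like the sketch below. The report does not say how the box is extracted from the heatmap, so the fixed `threshold=0.5` is purely an assumption, as are the function names. The last helper reflects the report's statement that the rotation angle is taken from the bounding box's diagonal; measuring that angle against the horizontal axis is also an assumption.

```python
import math

def combine_heatmap(contrast_map, emd_map):
    """Add two equally sized 2-D maps and rescale the result to [0, 1]."""
    summed = [[a + b for a, b in zip(r1, r2)]
              for r1, r2 in zip(contrast_map, emd_map)]
    lo = min(min(row) for row in summed)
    hi = max(max(row) for row in summed)
    span = hi - lo
    return [[(v - lo) / span if span else 0.0 for v in row] for row in summed]

def bounding_box(heat, threshold=0.5):
    """Smallest box (x0, y0, x1, y1) covering cells >= threshold, or None."""
    hits = [(x, y) for y, row in enumerate(heat)
            for x, v in enumerate(row) if v >= threshold]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return min(xs), min(ys), max(xs), max(ys)

def diagonal_angle_degrees(box):
    """Angle between the box diagonal and the horizontal axis."""
    x0, y0, x1, y1 = box
    return math.degrees(math.atan2(y1 - y0 + 1, x1 - x0 + 1))
```

A single global threshold is the simplest choice and inherits any failure of the heatmap itself: if waves score as high as the whale, they are pulled into the box as well.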

I realized that detecting the whale outline requires taking into account both color difference and color distribution. Color difference can be handled with the median filter and histogram equalization, since histogram equalization spreads apart image regions that have similar pixel intensities. Color distribution within a small enough area also carries context that should be used. The basic assumption was that most of the image is water rather than whale, and that the whale's dorsal callosity patterns can be summarized as a visual bag of words that can be transformed into a histogram and compared. The last assumption was that the subgrid color patterns of whale and water are distinguishable by the Earth Mover's Distance.

I experimented with both the L2 norm and the Earth Mover's Distance (EMD) for the histogram distance calculation. I had only a general intuition that EMD is a metric based on how probable it is that both histograms share the same distribution given both observations, but it proved to be the more accurate of the two.

As a result of localization I obtained a bounding box of the whale, from which I calculated the angle of rotation using the bounding box's diagonal. I then rotated the cropped images so that they were all aligned the same way.

3) Classification Models

I used a standard multiclass SVM (ECOC SVM) with the default linear classifier. For the training set I extracted and aggregated HoG features, which I clustered using k-means with num_center=20. I used the spatial pyramid model from problem set 4 to train the ECOC SVM and predict with it. I calculated multiclass prediction accuracy using

$$\text{Log-loss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M}\log(p_{ij})$$

for both the SURF and HoG feature-based SVMs, with N test samples and M classes, where I used a set of binary trainers that assigned p_{ij} = 0 if the i-th test sample did not belong to class j and p_{ij} = 1 if it did.
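A minimal sketch of a multiclass log-loss computation follows. Two details are assumptions that the report does not state: each term is weighted by a true-label indicator, as in the standard definition of the metric, so only the probability assigned to the correct class contributes; and probabilities are clipped to [eps, 1 - eps] with `eps=1e-15` so that hard 0/1 predictions do not produce log(0). The function name and argument layout are hypothetical.

```python
import math

def multiclass_log_loss(true_labels, probabilities, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    true_labels:   list of class indices, one per sample.
    probabilities: list of rows, probabilities[i][j] = p_ij.
    eps:           clipping bound (assumed) that keeps log() finite.
    """
    total = 0.0
    for label, row in zip(true_labels, probabilities):
        p = min(max(row[label], eps), 1.0 - eps)
        total += math.log(p)
    return -total / len(true_labels)

# Hard 0/1 outputs, as produced by a set of binary classifiers.
all_correct = multiclass_log_loss([0, 1], [[1.0, 0.0], [0.0, 1.0]])
one_wrong = multiclass_log_loss([0], [[0.0, 1.0]])
print(all_correct, one_wrong)
```

With hard 0/1 outputs and this clipping, each misclassified sample contributes -log(1e-15), roughly 34.5, so the average is dominated by the fraction of wrong predictions; calibrated probabilities would be penalized far less for the same mistakes.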

Experimental Results

The log-loss of the SVM was 19.2 for the HoG feature-based SVM and 22.1 for the SURF feature-based SVM. Since the baseline for submission in the Kaggle competition is 34.4 and the winning team achieved 0.596, my classification method gained little accuracy. Possible reasons for this unsuccessful classification are localization error, particularly the remaining difficulty of distinguishing waves from whale callosities, and the lack of rectification.

Figure 5: Localization failure. (Top-left) Median subtraction and normalization. (Top-right) EMD distance matrix image. (Bottom-left) Aggregated image. (Bottom-right) Bounding box failure.

Figure 5 shows that waves are still picked up by the median subtraction part of the algorithm, and that histogram comparison with the Earth Mover's Distance is not a very good way to distinguish the whale from the background when callosities are not visible and waves surround the whale. Because those waves formed a boundary along the whale's outline, they were treated as the region of interest.

Conclusion and Future Direction

In the future, there should be a series of descriptors that account not only for general local features such as corners and gradients, but also a neural network over such features that provides spatial context. Such a feature space would capture what a simple bag-of-words classifier could not, especially when the object's outline is not distinct and the background noise is very similar in shape and color to the object features we are looking for.

Code: https://github.com/jungsw/cs231a

References

Ofir Pele and Michael Werman, "Fast and robust earth mover's distances," in Proc. 2009 IEEE 12th Int. Conf. on Computer Vision, Kyoto, Japan, 2009, pp. 460-467.
