Page 1: Real Time Monitoring of CCTV Camera Images Using Object Detectors and Scene Classification (cs229.stanford.edu/proj2017/final-posters/5133020.pdf)

Results

[Figure: iPad notifications]

Real Time Monitoring of CCTV Camera Images Using Object Detectors and Scene Classification for Retail and Surveillance Applications

Anand Joshi

Introduction

Current surveillance and control systems, in retail and elsewhere, still require human supervision and intervention. This work aims to provide a real-time detection system for CCTV video, suitable for surveillance and control, inventory tracking, theft deterrence, and threat perception and detection, and to apply machine learning and deep learning techniques to real-world applications. The goal is to automate many tasks that are otherwise error-prone because of human error and fatigue. The solution can potentially provide real-time alerts and notifications on smartphones and tablets, and rich data for analytics.

Dataset

The dataset comprises color images in the following categories:

a) Everyday objects found in retail environments, obtained from ImageNet. Over 1.2 million images, divided into over 1000 classes, were used for training.
b) Guns and knives: the Knives Images Database, which contains 9340 negative examples and 3559 positive examples, and the Internet Movie Firearms Database, which contains 8557 images.
c) Human hands: the Hand Dataset, which contains about 14700 hand images from various sources, and the EgoHands dataset, which contains about 120000 images.

90% was used for training and 10% was used for validation.
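The 90%/10% split described above can be sketched as follows. This is a minimal illustration, not the author's pipeline: the function name `split_train_validation`, the fixed seed, and the placeholder file names are all assumptions made for the example.

```python
import random


def split_train_validation(items, train_fraction=0.9, seed=0):
    """Shuffle a copy of items and split it into (train, validation) lists."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for real image paths.
image_paths = [f"img_{i:05d}.jpg" for i in range(1000)]
train_paths, val_paths = split_train_validation(image_paths)
print(len(train_paths), len(val_paths))
```

Shuffling before cutting avoids a validation set made up of only the last class or source in the listing.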

Some Image Samples

Application Flow and System Setup

Detection using CNN -> Match found? (yes) -> Send push notification request using Node.js -> Apple Push Notification Service
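The "match found, then notify" step of this flow can be sketched as below. The poster sends the request through Node.js to the Apple Push Notification Service; this sketch only illustrates the decision logic in Python. The confidence threshold, the watched labels, the payload fields, and the `send` callback are all assumptions for illustration, not details from the poster.

```python
import json

CONFIDENCE_THRESHOLD = 0.5                    # assumed value, not from the poster
WATCHED_LABELS = frozenset({"gun", "knife"})  # assumed label names


def find_match(predictions, watched=WATCHED_LABELS, threshold=CONFIDENCE_THRESHOLD):
    """Return the most confident watched (label, confidence) pair, or None."""
    hits = [(label, conf) for label, conf in predictions
            if label in watched and conf >= threshold]
    return max(hits, key=lambda hit: hit[1]) if hits else None


def build_notification(label, confidence, camera_id):
    """Build a JSON body for a hypothetical notification service."""
    return json.dumps({
        "alert": f"{label} detected on camera {camera_id}",
        "confidence": round(confidence, 2),
    })


def process_frame(predictions, camera_id, send):
    """Call send(payload) when a watched object is found; report whether it was."""
    match = find_match(predictions)
    if match is None:
        return False
    send(build_notification(match[0], match[1], camera_id))
    return True


# Usage: a list stands in for the real notification sender.
sent = []
process_frame([("shopping cart", 0.91), ("knife", 0.77)], "cam-1", sent.append)
process_frame([("shopping cart", 0.95)], "cam-1", sent.append)
print(sent)
```

Passing the sender in as a callback keeps the detection logic independent of the delivery mechanism, so the same check could hand off to a Node.js service or any other notifier.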

Feature Extraction and Prediction using CNN

Convolution layer: a set of learnable filters (kernels). Each filter represents a specific part of the image by preserving the spatial relationship between pixels.
Pooling layer (subsampling): reduces the dimensionality of each feature map but retains the most important information.
Fully connected layer: involves a softmax function, which makes the prediction by exponentiating and then normalizing the inputs. Its output represents the probabilities (confidence) of each class prediction:

$$\mathrm{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$$

This is most appropriate for image classification problems.
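As a small illustration of the softmax step, the sketch below exponentiates the inputs and normalizes them so they sum to 1. Subtracting the maximum input before exponentiating is a standard numerical-stability measure added here as an assumption; it does not change the result, and the poster does not mention it.

```python
import math


def softmax(logits):
    """Exponentiate each input and normalize so the outputs sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: three arbitrary class scores become three class probabilities.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The largest input always receives the largest probability, which is why the class with the highest output is taken as the prediction.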

CNN architecture: Inception-ResNet V2. It is more accurate than previous state-of-the-art models. Its Top-1 and Top-5 validation accuracies on the ILSVRC 2012 image classification benchmark, based on a single crop of the image, are 80.4% and 95.3% respectively.
Deep learning framework: TensorFlow, for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them.
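To make the Top-1 and Top-5 figures concrete, the sketch below computes top-k accuracy: a prediction counts as correct when the true class is among the k highest-scoring classes. The function name and the toy scores are illustrative assumptions, not data from the benchmark.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    correct = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            correct += 1
    return correct / len(labels)


# Toy per-class scores for four samples and three classes (made up).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 0, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```

Top-k accuracy can never be lower than Top-1 accuracy for k greater than 1, which matches the gap between the two figures quoted above.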

Training Epochs | Solver Type                 | Base Learning Rate
100             | Stochastic Gradient Descent | 0.0001

# Epochs | Solver | Learning Rate | Accuracy
100      | SGD    | 0.0001        | 99.97
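The training settings listed above (100 epochs, SGD, base learning rate 0.0001) correspond to repeated updates of the form w <- w - learning_rate * gradient. The sketch below applies that rule to a toy one-parameter problem. The loss, the samples, and the function names are assumptions for illustration only; the poster's actual training ran in TensorFlow on the CNN.

```python
import random


def sgd_step(weights, grads, learning_rate=0.0001):
    """One SGD update: move each weight against its gradient."""
    return [w - learning_rate * g for w, g in zip(weights, grads)]


def train(weights, grad_fn, samples, epochs=100, learning_rate=0.0001, seed=0):
    """Run SGD for a number of epochs, visiting samples in shuffled order."""
    rng = random.Random(seed)
    samples = list(samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for sample in samples:
            grads = grad_fn(weights, sample)
            weights = sgd_step(weights, grads, learning_rate)
    return weights


def squared_error_grad(weights, target):
    """Gradient of the toy loss (w - target)^2 with respect to w."""
    return [2.0 * (weights[0] - target)]


# Usage: fit a single weight toward the target value 1.0.
final = train([0.0], squared_error_grad, samples=[1.0, 1.0, 1.0])
print(final)
```

On this toy problem the weight moves only a small distance toward the target after 100 epochs, which illustrates how a learning rate as small as 0.0001 trades speed for stable, gradual updates.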


Future

Although Inception-ResNet V2 performed the best, I found that many predictions had a probability of only 20% to 40%, even when these predictions were correct. The first step I would like to take is to increase the confidence of these predictions so that the model is better trained. This could be done by training it on more data or by increasing the number of epochs when training the CNN. Also, after developing an end-to-end proof-of-concept solution, I strongly feel that it has the potential to become a commercially viable product.
