DOI: 10.4018/IJSKD.2020010101
International Journal of Sociotechnology and Knowledge Development, Volume 12 • Issue 1 • January-March 2020
Copyright © 2020, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Copy-Move Forgery Detection Based on Automatic Threshold Estimation
Aya Hegazi, Faculty of Computers and Informatics, Benha University, Benha, Egypt
Ahmed Taha, Faculty of Computers and Informatics, Benha University, Benha, Egypt
Mazen Mohamed Selim, Faculty of Computers and Informatics, Benha University, Benha, Egypt
ABSTRACT
Recently, users and news followers across websites face many fabricated images. The harm goes well beyond deception, to the point of defaming or imprisoning a person. Hence, image authentication has become a significant issue. One of the most common tampering techniques is copy-move. Keypoint-based methods are considered an effective way of detecting copy-move forgeries. In such methods, the feature extraction process is followed by a clustering technique that groups spatially close keypoints. Most clustering techniques depend heavily on a specific threshold to terminate the clustering, and determining the most suitable threshold requires a huge number of experiments. In this article, a copy-move forgery detection method is proposed. The proposed method is based on automatic estimation of the clustering threshold: the cutoff threshold of hierarchical clustering is estimated automatically based on clustering evaluation measures. Experimental results on various datasets show that the proposed method outperforms other relevant state-of-the-art methods.
KEYWORDS
Clustering Evaluation Measures, Copy-Move Detection, Image Forensics, Keypoint-Based Methods, Multiple-Copied Matching
1. INTRODUCTION
Digital images are everywhere, and they have the power to do infinitely more than a document. In the latter half of the last two decades, the internet, mobile technology, and the obsession with social media have strongly affected and changed people's lives (Katta & Patro, 2017; Mahajan et al., 2018; Muliawat et al., 2019). Recently, there has been a rapid increase in images shown in the media, such as social media and television, that are not all they appear to be. The authenticity of digital images is therefore a critical issue. Day by day, it becomes easier for anyone to manipulate images without leaving any visible clues. The wide availability of powerful image processing software such as Photoshop and GIMP makes digital image authentication more challenging.
Digital image forensics is the science of detecting tampered regions in images. Identifying the authenticity of digital images is very important in digital forensics. The purpose of digital image manipulation is to conceal or hide information, for various intentions, and thereby change the meaning of the image. Many areas have been affected: image manipulation in media, journalism, digital cinema, news, and politics can mislead public opinion, and in law it can lead to miscarriages of justice. Manipulated images have also been found in academic papers. A survey by Tijdink (Tijdink et al., 2014) reported that, in the past three years, 15% of offenders were involved in scientific misconduct such as fabricating, refutation, or manipulating data. A study by (Farid, 2006) reported that in the Journal of Cell Biology about 20% of accepted manuscripts have at least one figure that must be remade due to unsuitable image manipulation, and about 1% contain deceitful figures. These consequences make images less trustworthy.
The rapid growth of forged images and their influence in many areas have led to the development of tampering detection techniques. These techniques fall under two categories: active authentication and passive authentication methods (Al-Qershi & Khoo, 2013), as shown in Figure 1. Active authentication methods require some preprocessing of digital images, such as watermarking or signatures. Digital watermarking conceals a watermark in the image at the capturing end and extracts it at the authentication end to examine whether the image has been tampered with (Al-Qershi & Khoo, 2013). The main drawback of watermarking is that the watermark must be inserted either at capture time, using a specially equipped camera, or later by an authorized person (Qureshi & Deriche, 2015). Moreover, most cameras today are not equipped with a watermark embedding technique. In addition, the subsequent processing of the original image could degrade its visual quality. A digital signature is similar to a digital watermark. At the image-capturing end, unique features are extracted from the image as a signature. At the authentication end, the signature is regenerated using the same method, and the authenticity of the image is verified through comparison. Digital signatures have the same drawbacks as digital watermarking.
On the other hand, passive (blind) authentication methods authenticate images without requiring any prior information about them. They rely on the traces left on the image by the various processing operations applied during manipulation. Therefore, passive authentication methods are considered the most common (Lin et al., 2018). Passive detection techniques can be classified as forgery-type dependent or forgery-type independent. Forgery-type independent techniques detect forgeries regardless of the type of forgery. To detect general tampering, they exploit three types of artifacts: traces of re-sampling, compression, and inconsistencies (Redi et al., 2011). Forgery-type dependent techniques target certain types of forgery. Copy-move and splicing are examples (see Figure 1). Such forgeries copy and paste image regions either within the same image (copy-move) or from different images (splicing). A spliced image is created from at least two different images (Sharma & Ghanekar, 2019; Walia & Kumar, 2018). An example of image splicing is shown in Figure 2(d).
Copy-move, or cloning, is a technique of copying a region and pasting it in the same image. The forged image therefore contains at least two similar regions (see Figure 2(b)). Since the duplicated regions come from the same image, they inherit the same basic image properties, such as color palette, illumination conditions, and noise. Copy-move forgery is the most common type of image manipulation due to its simplicity and effectiveness (Al-Qershi & Khoo, 2013; Bakiah et al., 2016). Although this technique is easy to implement, it is hard to detect. In practice, forgery is often not limited to copying and pasting regions; some processing operations are also applied to these regions. These operations can be classified into intermediate operations (geometric transformations) and post-processing operations. Intermediate operations provide spatial synchronization and homogeneity between the copied region and its neighbors (Al-Qershi & Khoo, 2013; Bakiah et al., 2016). Examples of intermediate operations are rotation and scaling. Post-processing operations remove traces left by the forgery and make it unnoticeable. Additive noise, JPEG compression, and blurring are examples of post-processing operations (Liu et al., 2010). Since all those operations make detecting copy-move forgery more challenging, numerous methods have been proposed for Copy-Move Forgery Detection (CMFD). Most of them can be classified as either block-based or keypoint-based methods (Bakiah et al., 2016). In block-based methods, the image is divided into overlapping or non-overlapping blocks of fixed size. Keypoint-based methods, on the other hand, compute local interest points (keypoints) from the whole image without any subdivision.
Generally, CMFD techniques follow a common pipeline, as shown in Figure 3 (Christlein et al., 2012). Both block-based and keypoint-based methods follow the same pipeline steps except for feature extraction. The first stage of the CMFD pipeline is preprocessing. It is an optional stage that depends on the technique used. It aims to improve the image data either by enhancing the image features or by removing unwanted distortion. Converting an RGB image to grayscale is one of the most used preprocessing methods (Bakiah et al., 2016). After this conversion, features are extracted either from the divided blocks (block-based methods) or at the keypoints (keypoint-based methods), and they are stored in feature vectors. In the matching stage, each feature vector is compared with every other to find similarities within the same image. Once matched blocks or keypoints are detected in this stage, copy-move manipulations are determined. The matching technique depends on whether the extracted features are block-based or keypoint-based. In the filtering stage, outliers and false matches are removed. Finally, the detection result can be visualized to localize the tampered regions in the forged image. Visualization can be further refined by morphological operations such as hole filling. There is always a motivation for presenting a copy-move detection technique with low computational complexity and robustness against image processing operations.
The rest of the paper is structured as follows. Section 2 introduces related work. Section 3 explains our proposed method. The experimental results are given in Section 4. Finally, Section 5 concludes the paper.
2. RELATED WORK
In the literature, several block-based and keypoint-based techniques have been introduced for copy-move forgery detection. Among the block-based methods, the Discrete Cosine Transform (DCT) is considered the most widely used. (Fridrich et al., 2003) suggested the first method for detecting copy-move forgery; they used 256 DCT coefficients as features. Further improvements based on DCT have been introduced in (Cao et al., 2011b, 2011a; Hu et al., 2011; Junhong, 2010). Although block-based methods are robust to noise and JPEG compression, they are sensitive to geometric transformations such as rotation and scaling and often produce significant false positives. Moreover, their feature vectors are large, which results in high computational complexity (Christlein et al., 2012). In contrast, keypoint-based methods outperform block-based methods. They match features of the image instead of blocks, which results in lower computational complexity and lower memory consumption (Dada et al., 2016).
Figure 1. Existing image forgery detection techniques
Figure 2. Examples of blind forgery: (a) and (c) Original images; (b) Copy-move forged image; (d) Spliced forged image
Figure 3. Common processing pipeline for copy-move forgery detection
In particular, keypoint-based methods exhibit the most accurate and stable results in the presence of geometric transformations (e.g., scaling, rotation, and affine transformation). Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) are the most widely used keypoint-based methods. (Amerini et al., 2011) introduce a SIFT-based method in which feature vectors are extracted by SIFT and then matched using the generalized 2-nearest-neighbor (g2NN) algorithm, followed by agglomerative hierarchical clustering. Although this method can deal with multiple copy-move forgery, it fails to localize copy-move regions accurately. In addition, it is unable to separate duplicated regions that are near one another. Moreover, the clustering can give unacceptable results because an empirical threshold must be set to stop the process.
Shivakumar and Baboo (B. L. Shivakumar & Baboo, 2011) used the Harris detector as the keypoint detector and SIFT as the descriptor. The k-dimensional tree (kd-tree) algorithm is used for matching keypoints and detecting duplicated regions. Pan and Lyu (Pan & Lyu, 2010) present another SIFT-based detection algorithm. The detected SIFT keypoints are matched using the best-bin-first algorithm, followed by the Random Sample Consensus (RANSAC) algorithm to estimate the geometric transformation. However, quantitative results on a realistic dataset are not offered, and the method cannot accurately detect tampered regions because of the lack of correctly matched points. It also performs poorly when detecting small duplicated regions. In (Li et al., 2015), a method based on SIFT and segmentation is presented. The authors suggest using the EM algorithm to refine the transform estimation and reduce false matches. However, their method suffers from high computational complexity. Their work is tested on two datasets and cannot detect plain copy-move forgery accurately. Moreover, it identified some unforged images as forged. Yang et al. (Yang et al., 2017) present a method based on hybrid features. They used an interest point detector called KAZE combined with SIFT for feature extraction. KAZE is a Japanese word that means wind.
Although keypoint-based methods perform well, they still suffer from several issues that critically affect tampering detection. Most prior methods select thresholds empirically and do not consider the relationship between the threshold and the size of the image and of the forged region. Furthermore, they cannot detect enough keypoints in flat areas, which results in many false negatives and inaccurate localization. Generally, an effective and efficient copy-move forgery detection technique should be robust to image processing operations (i.e., intermediate and post-processing operations). Moreover, it should accurately localize tampered regions and have low computational complexity.
3. PROPOSED METHOD
The proposed method detects copy-move forgery based on SIFT features and agglomerative hierarchical clustering. The clustering threshold is estimated automatically, in a way that effectively and efficiently handles the most common issues of keypoint-based methods. Figure 4 shows the whole detection process of the proposed method. In our detection scheme, the image is first preprocessed. Then, features are extracted using SIFT. Distinctive SIFT keypoints are then matched against each other using a fast approximate nearest neighbor method. After that, hierarchical clustering is applied to the matched points. Finally, the image is filtered, and tampered regions are localized. Each step of the detection method is described in detail in the following subsections: Section 3.1 describes the image preprocessing, Section 3.2 introduces SIFT feature extraction, Section 3.3 illustrates keypoint matching, Section 3.4 presents the clustering and forgery detection, Section 3.5 shows the process of estimating the affine transformation, and finally Section 3.6 illustrates how tampered regions are localized.
3.1. Image Preprocessing by CLAHE
As mentioned earlier, the SIFT feature extraction algorithm cannot detect enough keypoints in flat areas. To limit this issue, Contrast-Limited Adaptive Histogram Equalization (CLAHE) (Ma et al., 2017) is used in our work to enhance the contrast of the image. CLAHE is a locally adaptive contrast enhancement method. In contrast to global histogram equalization (HE), CLAHE works locally on small areas called tiles rather than on the entire image. Each tile's contrast is improved so that the histogram of the processed region approximately matches the histogram specified by a distribution function (e.g., uniform, Gaussian, or Rayleigh). Before the Cumulative Distribution Function (CDF) is computed in CLAHE, the histogram is clipped at a specific value, which limits noise amplification. CLAHE depends on two main parameters: block (tile) size and clip limit. The quality of the enhanced image is controlled mainly by these parameters. The Rayleigh distribution is one of the commonly used target histogram shapes; it produces a bell-shaped histogram (Ma et al., 2017). The Rayleigh distribution function is given by:
$$y(i) = y_{\min} + \sqrt{2\alpha^{2}\,\ln\!\left(\frac{1}{1 - p(i)}\right)} \tag{1}$$
where $y_{\min}$ is the lower bound of the pixel value, $\alpha$ is the scaling parameter of the Rayleigh distribution, and $p(i)$ is the cumulative probability used to create the transfer function. A higher value of $\alpha$ results in more significant contrast enhancement in the image, while also increasing the saturation value and amplifying noise levels.
Figure 4. Framework of the proposed copy-move forgery detection method
Using CLAHE offers several advantages: it is easy to implement, it is simple to compute, and it provides good results in local areas of the image. CLAHE can limit the brightness saturation that usually results from HE, and it produces less noise (Kumar & Sharma, 2008).
It is worth noting that applying CLAHE indiscriminately to all images will not always give satisfying results. Some image regions may become oversaturated or darker, which critically affects the tampering detection results. Thus, CLAHE is applied only to low-contrast images. In our experiments, CLAHE based on the Rayleigh distribution function is applied. The clip limit and tile size are set to 0.01 and 4×4, respectively.
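The per-tile mapping described in this subsection can be illustrated with a small sketch. It is not the authors' MATLAB implementation: the function names, the way the clip limit is interpreted (as a fraction of the tile's pixel count), the uniform redistribution of the clipped excess, and the default value of alpha are all assumptions made for illustration. Tile interpolation, which a full CLAHE needs, is omitted.

```python
import math


def clipped_cdf(hist, clip_fraction):
    """Clip a tile histogram and return its cumulative probabilities.

    Assumption: the clip limit is a fraction of the tile's total pixel
    count, and the clipped excess is spread uniformly over all bins.
    """
    total = sum(hist)
    limit = clip_fraction * total
    clipped = [min(h, limit) for h in hist]
    excess = total - sum(clipped)
    clipped = [c + excess / len(hist) for c in clipped]
    cdf, running = [], 0.0
    for c in clipped:
        running += c
        cdf.append(running / total)
    return cdf


def rayleigh_transfer(p, y_min=0.0, alpha=0.4):
    """Equation (1): map a cumulative probability p to an output level."""
    p = min(p, 1.0 - 1e-9)  # keep the logarithm finite when p reaches 1
    return y_min + math.sqrt(2.0 * alpha ** 2 * math.log(1.0 / (1.0 - p)))
```

Each input gray level i of a tile would then be mapped to `rayleigh_transfer(cdf[i])`, followed by rescaling to the output intensity range.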
3.2. SIFT Features Extraction and Description
The proposed method is based on an effective keypoint detector and descriptor called Scale-Invariant Feature Transform (SIFT) (Lowe, 2004). The SIFT algorithm extracts distinctive (local) features from digital images that are invariant to image scaling and rotation and robust to changes in illumination, distortion, noise addition, and 3D viewpoint (Lowe, 2004). It is applied to the input image to extract distinctive SIFT keypoints, each represented by a 128-dimensional feature vector. As described in (Lowe, 2004), the major stages of the SIFT algorithm are, briefly:
• Scale-space peak selection: The goal of this stage is to identify locations and scales that can be repeatably assigned under different views of the same object. This is done by detecting extrema of a difference-of-Gaussian function at different scales of the image;
• Keypoint localization: Some of the keypoint candidates resulting from the previous stage are unstable (i.e., they lie along an edge or have low contrast). Therefore, a detailed model is fit to the nearby data to determine accurate location, scale, and contrast. Unstable keypoints are rejected, which increases the efficiency and robustness of the algorithm. At the end of this stage, the obtained keypoints are stable and scale invariant. For more details, see (Lowe, 2004);
• Orientation assignment: Based on local image properties, one or more orientations are assigned to each keypoint. The gradient direction and magnitude are calculated around each keypoint. Then the most prominent gradient orientation(s) are identified and assigned to that region. The keypoints obtained are now rotation invariant;
• Keypoint descriptor: After a location, scale, and orientation have been assigned to each keypoint, a descriptor is computed for the local image region based on a window around the detected keypoint. The output of this step is a set of unique SIFT keypoints represented by 128-dimensional descriptor vectors. The keypoint descriptor is highly distinctive and invariant to scaling, rotation, illumination change, and 3D viewpoint.
3.3. Keypoint Matching
In a copy-move forgery, keypoints extracted from the original and duplicated regions have the same descriptor vectors. Therefore, matching between them is applied to detect copy-move forgeries in the image. Matching between detected keypoints is usually done using g2NN, as in (Amerini et al., 2013; Dada et al., 2016; Li et al., 2015; Mohamdian & Pouyan, 2013). However, it is known to suffer from high complexity when identifying similarity among many high-dimensional vectors. In addition, it gives less accurate results, especially in high-dimensional space. Moreover, lexicographic sorting yields a higher false negative rate (Christlein et al., 2010). To address these issues, the Fast Approximate Nearest Neighbor (FANN) method introduced by Muja et al. (Muja & Lowe, 2009) is used. It is based on Best-Bin-First (BBF) search, a variant of kd-tree search that finds approximate nearest neighbors with high probability in less time. The work in (Christlein et al., 2010) showed that kd-tree matching generally gives better results than lexicographic sorting, especially in very-high-dimensional space. The idea of BBF is to search the bins of the kd-tree in order of their distance from the query, using a priority queue. The distance to a bin (node) is the minimum distance between the query and any point on the bin boundary. At each internal node visited, the position and the distance are stored in the queue; instead of backtracking, the closest entry is popped from the queue and the search continues from it (Lowe, 2004).
Given a test image, a set of keypoints $F = \{f_1, \ldots, f_n\}$ and the corresponding descriptors $D = \{D_1, \ldots, D_n\}$ are extracted. Matching is performed between feature descriptors to identify the similarity between them. The feature descriptors are used to build the kd-tree. After that, BBF search is applied on the kd-tree to find the N nearest neighbors of each keypoint $f_i$ among all other $(n-1)$ keypoints of the image. Nearest neighbors are computed using the Euclidean distance. Let the sorted Euclidean distances, known as the similarity vector, be denoted by $d = \{d_1, d_2, \ldots, d_{n-1}\}$. As suggested by (Lowe, 2004), the ratio of the distance to the closest neighbor to the distance to the second-closest neighbor is calculated and compared with a predefined threshold T (usually in the range 0.3 to 0.5) to reduce false matches. Therefore, the keypoint is matched only if the following constraint is satisfied:
$$\frac{d_1}{d_2} < T, \quad \text{where } T \in (0, 1) \tag{2}$$
Since a copy-move forgery may contain the same image area cloned multiple times, it is necessary to handle this case. The FANN algorithm can manage multiple copy-move forgery. At the end of this stage, all matched keypoints are kept and isolated ones are discarded.
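The ratio test of equation (2) can be sketched as follows. For brevity the sketch uses an exhaustive (brute-force) nearest-neighbor search rather than the kd-tree with BBF search used in the proposed method, so it illustrates only the acceptance rule; the function names and the default threshold of 0.5 are illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def ratio_test_matches(descriptors, threshold=0.5):
    """Return (i, j) pairs where j is the nearest neighbor of i and
    d1 / d2 < threshold, as in equation (2).

    Brute-force search is used here instead of a kd-tree (assumption
    made to keep the sketch short).
    """
    matches = []
    n = len(descriptors)
    for i in range(n):
        dists = sorted(
            (euclidean(descriptors[i], descriptors[j]), j)
            for j in range(n) if j != i
        )
        if len(dists) < 2:
            continue  # the ratio needs two neighbors
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d2 > 0 and d1 / d2 < threshold:
            matches.append((i, j1))
    return matches
```

With real SIFT output, `descriptors` would hold the 128-dimensional vectors of one image, and the returned index pairs would be passed on to the clustering stage.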
3.4. Clustering and Forgery Detection
In order to identify possible forged regions and group spatially close keypoints, Agglomerative Hierarchical Clustering (AHC) (Hastie et al., 2003) is applied to the spatial locations of matched feature pairs. AHC is a bottom-up clustering method. Hierarchical clustering creates a hierarchy of clusters in which clusters have sub-clusters, which in turn have sub-clusters, and so on. It starts with each keypoint in its own singleton cluster. Pairwise distances between clusters are then evaluated, and the closest pair of clusters is agglomerated (merged). This process is repeated until all clusters have been merged into a single cluster that contains all keypoints. AHC can be visualized using a tree-like diagram called a dendrogram, which shows the progressive grouping of the keypoints. It is then possible to determine a suitable number of clusters. Generally, the merging of keypoints is determined by two criteria: the linkage method and the cutoff threshold used to stop clustering. The cutoff threshold plays a key role in forgery detection and critically affects the results. Cutting off a dendrogram has long been a difficult issue in hierarchical clustering research (Abe et al., 2017). Usually, setting the cutoff threshold requires many experiments and optimization, as in (Amerini et al., 2011). Moreover, determining the cutoff threshold is tricky because image sizes and forged-region sizes differ. To handle this issue, the proposed method uses a dynamic cutoff, which automatically terminates the clustering process once the optimal number of clusters is obtained. It estimates the optimal number of clusters based on internal cluster validity measures. Generally, cluster validation is used to verify that any cluster found is really present in the data rather than being an artifact of the algorithm (Everitt et al., 2011). Cluster validation can be divided into two main kinds of methods: internal and external (Halkidi et al., 2002). Internal methods measure cluster quality based on inter-cluster separation and intra-cluster compactness (cohesion). In our work, two commonly used internal methods for AHC are considered: the gap statistic (Tibshirani, 2001) and the silhouette width (Gan, 2007). The proposed method is evaluated with both the gap statistic and the silhouette width (see Sections 4.3 and 4.4 for a detailed description of these experiments). Mathematical details of the gap statistic and the silhouette coefficient are given in Sections 3.4.1 and 3.4.2, respectively. For the linkage method, the “Ward” linkage is used in the proposed method, as it gives the best results (Amerini et al., 2011).
Ward's method assumes that a cluster is represented by its centroid. It measures the proximity between two clusters by the increase in the Sum of Squares Error (SSE) that results from merging them. The sum of squares in AHC starts at zero (every point is in its own singleton cluster) and grows as clusters are merged. Ward's method attempts to keep this growth as small as possible by minimizing the SSE of points from their cluster centroids.
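Given two clusters, A and B, the distance between them is given by (Amerini et al., 2011):
$$\Delta_{dist}(A, B) = SSE(AB) - \left[ SSE(A) + SSE(B) \right] \tag{3}$$
where:
$$SSE(A) = \sum_{i=1}^{n_A} \left\lVert x_{Ai} - \bar{x}_A \right\rVert^{2} \tag{4}$$
where $\Delta_{dist}$ is called the merging cost of combining the clusters A and B, $n_A$ is the number of points in cluster A, $x_{Ai}$ indicates the ith point in cluster A, and $\bar{x}_A$ is the centroid of cluster A.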
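As a small worked illustration of equations (3) and (4), the merging cost of two clusters of 2D keypoint locations can be computed directly from the definitions. The function names are hypothetical, and clusters are assumed to be plain lists of coordinate tuples.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sse(points):
    """Equation (4): sum of squared distances of points to their centroid."""
    c = centroid(points)
    return sum(sum((p[d] - c[d]) ** 2 for d in range(len(c))) for p in points)


def ward_cost(a, b):
    """Equation (3): increase in SSE caused by merging clusters a and b."""
    return sse(a + b) - (sse(a) + sse(b))
```

For example, with A = [(0, 0), (2, 0)] and B = [(10, 0), (12, 0)], each cluster has SSE 2, the merged cluster has SSE 104, and the merging cost is 100. At each step, Ward linkage merges the pair of clusters with the smallest such cost.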
3.4.1. Gap Statistic
The gap statistic (Tibshirani et al., 2001) is used to estimate the optimal number of clusters. The idea is that, for different numbers of clusters k, the change in the total intra-cluster variation is compared with its expectation under a null hypothesis (i.e., a distribution with no clustering). The value of k that maximizes the gap statistic is the estimate of the optimal number of clusters: for that k, the clustering structure is farthest from a random uniform distribution of points. There are two choices of reference distribution, based either on a uniform distribution or on principal component analysis (more details in (Tibshirani et al., 2001)). For simplicity, the uniform reference distribution is used in the proposed method. Briefly, the gap statistic works as follows:
1. Cluster the data into k clusters $C_1, C_2, \ldots, C_k$, with $C_r$ denoting the indices of the data in cluster r;
2. Compute the sum of pairwise distances for all points in cluster r:
$$D_r = \sum_{i, i' \in C_r} d_{ii'} \tag{5}$$
3. Compute the total intra-cluster variation, where $n_r$ is the number of points in cluster r:
$$W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} D_r \tag{6}$$
4. B reference datasets are generated based on a reference distribution method. Then every one of those reference datasets is clustered into k clusters;
5. The corresponding total intra-cluster variation $W^{*}_{kb}$, $b = 1, 2, \ldots, B$, $k = 1, 2, \ldots, K$, is computed;
6. The estimated gap statistic is computed:
$$Gap(k) = \frac{1}{B} \sum_{b=1}^{B} \log\left(W^{*}_{kb}\right) - \log\left(W_k\right) \tag{7}$$
7. Compute the standard deviation $sd_k$ and define the standard error $s_k$:
$$sd_k = \left[ \frac{1}{B} \sum_{b} \left( \log\left(W^{*}_{kb}\right) - \bar{l} \right)^{2} \right]^{1/2} \tag{8}$$
$$s_k = sd_k \sqrt{1 + \frac{1}{B}} \tag{9}$$
where:
$$\bar{l} = \frac{1}{B} \sum_{b} \log\left(W^{*}_{kb}\right) \tag{10}$$
8. Finally, the smallest value of k for which the gap statistic is within one standard error of the gap at k+1 is selected as the optimal number of clusters $\hat{k}$:
$$\hat{k} = \text{smallest } k \text{ such that } Gap(k) \geq Gap(k+1) - s_{k+1}$$
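Steps 6 to 8 above can be sketched in a few lines, assuming the dispersions $W_k$ and $W^{*}_{kb}$ have already been computed by clustering the data and the B reference datasets. The function names and the list-based interface are illustrative assumptions, not part of the original method description.

```python
import math


def gap_and_error(w_k, w_ref):
    """Equations (7)-(10) for a single k.

    w_k   -- dispersion W_k of the observed data
    w_ref -- list of the B reference dispersions W*_kb
    Returns (Gap(k), s_k).
    """
    b_count = len(w_ref)
    logs = [math.log(w) for w in w_ref]
    l_bar = sum(logs) / b_count                       # equation (10)
    gap = l_bar - math.log(w_k)                       # equation (7)
    sd = math.sqrt(sum((x - l_bar) ** 2 for x in logs) / b_count)  # (8)
    return gap, sd * math.sqrt(1.0 + 1.0 / b_count)   # equation (9)


def select_k(gaps, errors):
    """Step 8: smallest k with Gap(k) >= Gap(k+1) - s_{k+1}.

    gaps[i] and errors[i] correspond to k = i + 1. Returns None when no
    k in the tested range satisfies the criterion.
    """
    for i in range(len(gaps) - 1):
        if gaps[i] >= gaps[i + 1] - errors[i + 1]:
            return i + 1
    return None
```

In the context of this section, the selected k would determine where the dendrogram produced by AHC is cut.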
3.4.2. Silhouette Width
The silhouette width (Gan, 2007) can be used to represent the compactness and separation of clusters. It measures how close every point in one cluster is to the points in the neighboring clusters. For different values of k, the average silhouette of the observations is computed; the k that maximizes the average silhouette over all candidate values is taken as the optimal number of clusters. Suppose point $x_i$ belongs to cluster k. First, let the average distance between this point and all others in the same cluster be $a_i$. Second, let the average distance between this point and those in cluster $l \neq k$ be $d_{li}$. Let the average distance between $x_i$ and the nearest cluster of which it is not a member be denoted by $b_i = \min_{l} d_{li}$. Then the silhouette width is given by:
$$s(x_i) = \frac{b_i - a_i}{\max\{a_i, b_i\}} \tag{11}$$
A larger value of $s(x_i)$ means that $x_i$ lies better in its cluster. The entire clustering structure is measured by the average silhouette of all N data points (also called the Silhouette Width Criterion, SWC). The best clustering structure corresponds to the maximum SWC, which is given by:
$$SWC = \frac{1}{N} \sum_{i=1}^{N} s(x_i) \tag{12}$$
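A minimal sketch of equations (11) and (12) follows. It assumes at least two clusters, labels given as a Python list, and the common convention (an assumption here, not stated in the text) that a point alone in its cluster contributes a silhouette of zero.

```python
import math


def silhouette_width_criterion(points, labels):
    """Average silhouette (SWC, equation 12) of a labeled point set."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    clusters = sorted(set(labels))
    total = 0.0
    for i, (p, k) in enumerate(zip(points, labels)):
        same = [dist(p, q)
                for j, (q, l) in enumerate(zip(points, labels))
                if l == k and j != i]
        if not same:
            continue  # singleton cluster: s(x_i) taken as 0 (assumption)
        a_i = sum(same) / len(same)
        # b_i: smallest average distance to any cluster x_i is not in
        b_i = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == c)
            / labels.count(c)
            for c in clusters if c != k
        )
        total += (b_i - a_i) / max(a_i, b_i)   # equation (11)
    return total / len(points)
```

Evaluating this criterion for the labelings obtained by cutting the dendrogram at different numbers of clusters, and keeping the cut with the largest value, mirrors the selection rule described above.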
3.5. Affine Transform Estimation
At this point, it is known that the image is tampered and where the source and target (copy-moved) regions are. Since copy-move regions undergo geometric distortions (such as scaling and rotation), the relationship between tampered regions is estimated using an affine transformation specified by a transformation matrix H. Let the coordinates of two corresponding matched points from a region and its duplicate be $X = (x, y)^{T}$ and $X' = (x', y')^{T}$, respectively. The geometric relationship between these two regions is expressed as:
$$X' = H X \tag{13}$$
It can be expressed in matrix form as:
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \tag{14}$$
where the parameters $a_{11}$, $a_{12}$, $a_{21}$, $a_{22}$ are associated with scaling and rotation, while $t_x$, $t_y$ are associated with translation. Since an affine transformation has six degrees of freedom (six matrix parameters), at least three non-collinear matched pairs are needed to obtain it. Given a set of correspondences $(X_1, \ldots, X_n)$ and $(X'_1, \ldots, X'_n)$, the transformation matrix H can be computed by minimizing the following total error function:
$$\sum_{i=1}^{n} \left\lVert X'_i - H X_i \right\rVert^{2} \tag{15}$$
Mismatched points or outliers can severely affect the estimated transformation matrix H. Therefore, a widely used robust estimation method called Random Sample Consensus (RANSAC) (Fischler & Bolles, 1981) is employed to perform the previous estimation more accurately. This method is also adopted by other CMFD techniques (Amerini et al., 2011; Dada et al., 2016). RANSAC randomly selects three or more pairs of points from the matched points and estimates the matrix H by minimizing the total error function given in equation (15). Then all the remaining points are transformed according to H. After that, all pairs of matched points are classified as inliers or outliers depending on the estimated matrix H. A pair of matched points is considered an inlier if $\lVert X' - HX \rVert \leq \beta$; otherwise, it is an outlier. After $N_i$ iterations, RANSAC returns the estimated transform parameters that yield the largest number of inliers. In our experiments, $\beta$ is set to 3 and $N_i$ is set to 1000. Finally, a set of affine transformations $H_1, \ldots, H_n$ is acquired.
Estimation of the affine transformation is an important stage of the CMFD scheme. Falsely detected regions that do not have a set of points with a uniform transform relationship are removed in this stage. Moreover, it enhances the detection of copy-move forgery by providing more details about the tampered image. In the literature, most recent CMFD methods choose to estimate geometric transformations (Amerini et al., 2011; Shivakumar & Baboo, 2011; Dada et al., 2016; Hashmi et al., 2014; Li et al., 2015).
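The RANSAC loop described in this subsection can be sketched as follows. This is a simplified illustration under stated assumptions: each iteration solves the affine parameters exactly from three sampled pairs (using Cramer's rule) instead of minimizing equation (15) over a larger sample, no final refit on the inliers is performed, and the function names and the fixed random seed are hypothetical.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def affine_from_three(src, dst):
    """Top two rows of H from three pairs, or None if src is collinear."""
    m = [[x, y, 1.0] for x, y in src]
    d = det3(m)
    if abs(d) < 1e-9:
        return None
    rows = []
    for axis in (0, 1):
        rhs = [p[axis] for p in dst]
        row = []
        for col in range(3):  # Cramer's rule: swap one column for rhs
            mc = [r[:] for r in m]
            for i in range(3):
                mc[i][col] = rhs[i]
            row.append(det3(mc) / d)
        rows.append(row)
    return rows


def apply_affine(h, p):
    """Apply the two-row affine matrix h to point p."""
    return (h[0][0] * p[0] + h[0][1] * p[1] + h[0][2],
            h[1][0] * p[0] + h[1][1] * p[1] + h[1][2])


def ransac_affine(src, dst, beta=3.0, iterations=1000, seed=0):
    """Return (H, inlier indices) with the largest inlier count found."""
    rng = random.Random(seed)
    best_h, best_inliers = None, []
    for _ in range(iterations):
        idx = rng.sample(range(len(src)), 3)
        h = affine_from_three([src[i] for i in idx], [dst[i] for i in idx])
        if h is None:
            continue
        inliers = []
        for i, (p, q) in enumerate(zip(src, dst)):
            px, py = apply_affine(h, p)
            if ((px - q[0]) ** 2 + (py - q[1]) ** 2) ** 0.5 <= beta:
                inliers.append(i)
        if len(inliers) > len(best_inliers):
            best_h, best_inliers = h, inliers
    return best_h, best_inliers
```

The defaults `beta=3.0` and `iterations=1000` follow the values reported above; in practice, a least-squares refit of H on the returned inliers would usually follow.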
3.6. Tampered Regions Localization
Once an image is considered tampered, the duplicated regions can be localized. With the estimated affine transformation H, identical regions are found by comparing each pixel in the image with its transformed counterpart. The proposed method uses the localization method of (Yang et al., 2017). Localizing tampered regions proceeds in the following steps:
1. All points in the image are transformed forward with the matrix $H$ and backward with the inverse matrix $H^{-1}$. Let $R_o$ be the original region and $R_F$ its related tampered region; then the relation between these two regions in terms of the transformation matrix is:
$$R_F = H R_o, \quad R_o = H^{-1} R_F$$
To localize the tampered regions, the similarity between the original image I and the transformed (warped) image T is measured using Zero-Mean Normalized Cross-Correlation (ZNCC). Let the pixel intensities at location x be denoted by $I(x)$ and $T(x)$; then:
$$ZNCC(x) = \frac{\sum_{v \in \Omega(x)} \left( I(v) - \bar{I} \right)\left( T(v) - \bar{T} \right)}{\sqrt{\sum_{v \in \Omega(x)} \left( I(v) - \bar{I} \right)^{2} \sum_{v \in \Omega(x)} \left( T(v) - \bar{T} \right)^{2}}}, \quad ZNCC \in (0, 1) \tag{16}$$
where $\Omega(x)$ is a 7×7 pixel neighborhood centered at location $x$, $I(v)$ and $T(v)$ are the pixel intensities at location $v$, and $\bar{I}$ and $\bar{T}$ are the average pixel intensities of I and T computed over $\Omega(x)$. A larger ZNCC value indicates higher similarity.
2. A correlation map is now obtained. To reduce noise, a Gaussian filter of size 7 pixels is applied. Then, the map is converted to a binary image with a threshold value (th = 0.55 in our experiments);
3. Finally, a morphological operation is applied to fill the holes in the binary image.
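The similarity measure used in the steps above can be sketched for a single neighborhood. The sketch assumes the standard zero-mean normalized cross-correlation with the two squared sums multiplied under the square root; the function name and the flat-list patch representation are illustrative assumptions. Note that, in this form, anti-correlated patches produce negative values.

```python
import math


def zncc(patch_i, patch_t):
    """ZNCC of two equal-length flat lists of pixel intensities.

    Returns 0.0 when either patch is constant, since the normalization
    is undefined in that case (convention chosen for this sketch).
    """
    n = len(patch_i)
    mean_i = sum(patch_i) / n
    mean_t = sum(patch_t) / n
    num = sum((a - mean_i) * (b - mean_t) for a, b in zip(patch_i, patch_t))
    den = math.sqrt(sum((a - mean_i) ** 2 for a in patch_i)
                    * sum((b - mean_t) ** 2 for b in patch_t))
    return num / den if den > 0 else 0.0
```

To build the correlation map, this function would be evaluated on the 49 intensities of the 7×7 window around every pixel of I and the corresponding window of the warped image T, and the resulting map would then be smoothed and thresholded as in step 2.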
4. EXPERIMENTAL RESULTS
In this section, the performance of the proposed detection method is evaluated. Section 4.1 describes the test image datasets used in our experiments. Error measures are introduced in Section 4.2. Results on the MICC-F220 and benchmark datasets are presented in Sections 4.3 and 4.4, respectively. A comparison of the proposed method with other related methods is given in Section 4.5. Section 4.6 shows the robustness of the proposed method to processing operations. Finally, Section 4.7 shows the ability of the proposed method to detect multiple copy-move forgeries. All measurements are performed on a desktop computer with an Intel Core i5 1.7 GHz CPU and 4 GB of RAM, running MATLAB R2016b.
4.1. Test Image Dataset
In our experiments, two publicly available datasets are used. Both are among the datasets most widely used by CMFD methods (Bakiah et al., 2016). The first dataset is MICC-F220, introduced by Amerini (Amerini et al., 2011). It consists of 220 images in total: 110 are original images and 110 are tampered. The image resolution ranges from 722×480 to 800×600 pixels. About 1.2% of the entire image is covered by a forged region. The processing operations in this dataset are limited to translation, scaling, and rotation. The ground truth of this dataset is not given. The second dataset is the benchmark dataset created by (Christlein et al., 2012). This dataset consists of 48 base images and their transformed versions, produced by rotation, scaling, JPEG compression, and additive noise. The parameters applied in each attack are given in Table 1. The images of this dataset are quite large; their average size is about 3000×2300 pixels. About 10% of the entire image is covered by a forged region. In this dataset, the duplicated regions are meaningful and are categorized as rough (e.g., rocks), smooth (e.g., sky), or structured (Christlein et al., 2012). This dataset is provided with a ground truth. Table 2 gives the details of each dataset.
In our experiments, a total of 1756 images have been tested: the 220 images of the MICC-F220 dataset and the following images from the benchmark dataset. (1) Original: 48 original images. (2) Plain copy-move: 48 tampered images without any processing operations applied. (3) Rotation: the duplicated regions are rotated by an angle varying from 2° to 10° with a step of 2°, which gives 48×5 = 240 images. (4) Scaling: the duplicated regions are rescaled to between 91% and 109% of their original size with a step of 2%, which gives 48×10 = 480 images. (5) JPEG compression: the forged images are compressed with quality factors varying from 20 to 100 with a step of 10, which gives 48×9 = 432 images. (6) Noise addition: noise is added to the duplicated regions with a standard deviation varying from 0.02 to 0.1 with a step of 0.02, which gives 48×5 = 240 images. (7) Multiple copies: for each of the 48 images, a block of 64×64 pixels is selected and randomly copied five times.

Table 1. Settings of attack parameters

| Attack | Parameter (start:step:end) |
|---|---|
| Rotation | Angle (2°:2°:10°) |
| Scaling | Ratio (0.91:0.02:1.09) |
| JPEG compression | Quality factor (20:10:100) |
| Noise addition | Standard deviation (0.02:0.02:0.1) |

Table 2. Details about image datasets used

| Dataset | Image size | Total images | Processing operations | Ground truth |
|---|---|---|---|---|
| MICC-F220 | 722×480 to 800×600 | Total: 220 (110 original, 110 tampered) | Translation; rotation; scaling (symmetric, asymmetric); combined transformation | No |
| Benchmark CMFD | 420×300 to 3888×2592 | Original: 48. Tampered: 48 (plain CMF), 240 (rotated), 480 (scaled), 432 (JPEG), 240 (noise) | Rotation; scaling; JPEG compression; noise; combined transformation | Yes |
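The image counts given in Section 4.1 can be tallied with a few lines of Python. This is only a bookkeeping sketch: the parameter ranges are taken from Table 1, while the helper `steps` and the dictionary keys are illustrative names that do not appear in the paper.

```python
# Tally of the test images described in Section 4.1.
# Parameter ranges follow Table 1; all names here are illustrative.

BASE_IMAGES = 48  # basic images in the benchmark dataset


def steps(start, stop, step):
    """Number of values in the inclusive range start:step:stop."""
    return round((stop - start) / step) + 1


counts = {
    "MICC-F220": 220,
    "benchmark originals": BASE_IMAGES,
    "plain copy-move": BASE_IMAGES,
    "rotation": BASE_IMAGES * steps(2, 10, 2),
    "scaling": BASE_IMAGES * steps(0.91, 1.09, 0.02),
    "JPEG compression": BASE_IMAGES * steps(20, 100, 10),
    "noise addition": BASE_IMAGES * steps(0.02, 0.10, 0.02),
    "multiple copies": BASE_IMAGES,
}

total = sum(counts.values())
for name, n in counts.items():
    print(f"{name:22s} {n:5d}")
print(f"{'total':22s} {total:5d}")
```

The sum reproduces the 1756 images reported for the experiments.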
4.2. Error Measures
To test the performance of our CMFD method, we follow the approach presented by Christlein et al. (2012). The performance of a CMFD scheme is evaluated at two levels: (1) image level, the ability to determine whether an image has been tampered or not; and (2) pixel level, the ability to localize the tampered regions correctly. The evaluation metrics commonly used in CMFD are the True Positive Ratio (TPR) and the False Positive Ratio (FPR), computed from the counts described in Table 3. A CMFD technique is effective if it maintains a high TPR while keeping the FPR at a minimum. TPR (recall), FPR, and precision are calculated as:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{17}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{18}$$

$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{19}$$
In the proposed method, an image is considered forged if at least one affine transformation is estimated between image regions. In addition, the F1 score, which combines precision and recall into a single value, is used as an evaluation metric (Christlein et al., 2012):
$$F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \tag{20}$$
Table 3. Evaluation measures description

| Measure | Description |
|---|---|
| TP (True Positive) | Correctly detected forged images |
| TN (True Negative) | Correctly detected original images |
| FP (False Positive) | Original images falsely detected as forged |
| FN (False Negative) | Falsely missed forged images |
At the pixel level, the same evaluation metrics are used, where TP is the number of pixels correctly detected as forged, FP is the number of pixels falsely detected as forged, and FN is the number of forged pixels that were missed.
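Equations (17) to (20) translate directly into code. The sketch below is illustrative rather than the authors' Matlab implementation; the function name `detection_metrics` is hypothetical, and the confusion counts in the usage example are assumed values chosen for 110 forged and 110 original images, since the paper reports only percentages.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return precision, recall (TPR), FPR and F1 from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "fpr": fpr, "f1": f1}


# Assumed counts for 110 forged and 110 original images:
# every forged image is flagged, and 7 originals are flagged by mistake.
m = detection_metrics(tp=110, fp=7, tn=103, fn=0)
print({name: round(100 * value, 2) for name, value in m.items()})
```

The same function applies at the pixel level if the counts are numbers of pixels instead of numbers of images.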
4.3. Results on MICC-F220 Dataset
It is worth noting that image size and forged region size play an important role when evaluating a forgery detection method. In this dataset, the images are not very large: as mentioned earlier, image sizes vary from 722×480 to 800×600 pixels, and the forged regions are small compared to the whole image. It was found that the smaller the image and the forged region are, the fewer features and matched points are found. In our matching method, the threshold value is set to 0.5, which yields the best results. A lower threshold results in few matched points, while larger values result in many false matches. For the cluster validity measures, both the gap statistic and Silhouette width are applied, and the results are compared. The value of k (the list of candidate cluster counts) varies from 1 to 7. A larger range of k results in a low TPR and a high FPR because cluster sizes depend heavily on the number of matched points. Silhouette width produces the best results in terms of TPR and FPR, as illustrated in Table 4.
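Selecting the number of clusters from a list of candidate values of k with a validity measure can be illustrated in a few lines. The following is a minimal pure-Python sketch under stated assumptions, not the authors' Matlab implementation: it assumes single-linkage agglomerative clustering on 2-D keypoint coordinates, uses the average Silhouette width as the only criterion, and all function names are hypothetical. Because the Silhouette width is undefined for a single cluster, the sketch only scores candidates with k of at least 2.

```python
import math


def agglomerate(points, k):
    """Merge clusters by single linkage until k clusters remain.

    Returns one cluster label per point.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def silhouette_width(points, labels):
    """Average silhouette width; points in singleton clusters score 0."""
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)
    total = 0.0
    for i, label in enumerate(labels):
        own = groups[label]
        if len(own) == 1:
            continue  # s(i) = 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(points[i], points[j]) for j in other) / len(other)
                for other_label, other in groups.items() if other_label != label)
        denom = max(a, b)
        total += (b - a) / denom if denom else 0.0
    return total / len(points)


def best_cluster_count(points, k_values):
    """Pick the k in k_values with the largest average silhouette width."""
    scores = {k: silhouette_width(points, agglomerate(points, k))
              for k in k_values if 2 <= k < len(points)}
    return max(scores, key=scores.get), scores


# Toy keypoint coordinates forming three well-separated groups.
keypoints = [(0, 0), (0, 1), (1, 0),
             (10, 10), (10, 11), (11, 10),
             (20, 0), (20, 1), (21, 0)]
best_k, scores = best_cluster_count(keypoints, range(1, 8))
print(best_k)
```

The point of the design is that the dendrogram is cut where the validity measure peaks, so no cutoff distance has to be tuned by hand. A gap-statistic variant would replace `silhouette_width` with a comparison against reference data and could also score k = 1.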
4.4. Results on Benchmark Dataset
As previously mentioned, the images of this dataset are relatively large, with an average size of 3000×2300 pixels. Large images are more challenging, since an overall higher number of feature vectors exists, and thus there is a higher probability of false positives. In this case, the matching threshold is set to 0.4. To avoid a high number of false positives, k ranges from 1 to 20. The proposed method is examined in different cases: plain copy-move forgery; robustness to processing operations (rotation, scaling, JPEG compression, and noise addition); and multiple pasted regions. We also compare the results obtained using the gap statistic and Silhouette width. On this dataset, the gap statistic gives the best results. Figure 5 shows some examples of tampered images detected by the proposed method. The localization results on test images with CMF regions blended into the background are illustrated in Figure 6. The proposed method detects most CMF regions.
4.5. Comparisons with other Relevant Methods
To verify the performance of the proposed method, our results are compared with various recent keypoint-based methods. All comparisons are made using the MICC-F220 dataset and the benchmark dataset. The methods used for comparison are those of Pan and Lyu (Pan & Lyu, 2010), Amerini et al. (Amerini et al., 2011), Shivakumar and Baboo (Shivakumar & Baboo, 2011), Li et al. (Li et al., 2015), Hashmi et al. (Hashmi et al., 2014), and Yang et al. (Yang et al., 2017). On the MICC-F220 dataset, the proposed method is compared with the methods in (Amerini et al., 2011) and (Hashmi et al., 2014). The proposed method and the method presented in (Amerini et al., 2011) achieve the best TPR, but the proposed method achieves a lower FPR, as shown in Table 5. The average processing time of the proposed method is about 0.43 seconds per image, whereas the other two methods take about 4.94 and 2 seconds, respectively.
On the benchmark dataset, the performance in the case of plain copy-move forgery is evaluated. Here, 48 original images and 48 forged images are used without applying any processing operations, and the CMFD methods must determine whether each image has been tampered or not. Detection results for plain copy-move forgery are shown in Table 6. The proposed method and the method presented in (Yang et al., 2017) obtain the best recall, 100% and 97.92%, respectively. For precision, the proposed method obtains the best result, 90.90%, followed by the method presented in (Hashmi et al., 2014) with 88.89%. The proposed method also achieves the best F1 score, 95.23%. In conclusion, the proposed method obtains the best results in terms of recall, precision, and F1 score compared to the other methods.

Table 4. TPR and FPR values on MICC-F220 with respect to cluster evaluation method

| Method | TPR | FPR |
|---|---|---|
| Gap statistic | 95.45% | 8.18% |
| Silhouette width | 100% | 6.36% |
4.6. Robustness to Processing Operations
The proposed method has additionally been tested for robustness, that is, for its detection performance under rotation, scaling, JPEG compression, and noise addition, using the benchmark dataset (Christlein et al., 2012):
1. Robustness to Gaussian noise: Image intensities are normalized to the range 0 to 1, and zero-mean Gaussian noise with standard deviations of 0.02, 0.04, 0.06, 0.08, and 0.10 is added. As the standard deviation increases, the number of false negatives increases and the number of true positives decreases. In addition, a standard deviation of 0.10 leads to clearly visible artifacts (Christlein et al., 2012). Overall, the proposed method maintains a high TPR, as illustrated in Table 7;
2. Robustness to JPEG compression artifacts: The quality factor varies between 100 and 20 with a step of 10. The same JPEG compression is applied to the 48 forgeries at each quality level. The visual quality of the image is strongly affected when the quality factor is very low. For real-world forgeries, quality levels down to at least 70 are considered reasonable assumptions (Christlein et al., 2012). TPR tends to decrease slightly as image quality decreases. The proposed method maintains a high TPR, as illustrated in Table 7;
Figure 5. Examples of tampered images detected by proposed method
Figure 6. Detection results on the images with copy-move forgery regions combined in the background. The first column represents the test images from benchmark dataset. The second column represents the ground truth of the copy-move forgery regions in these images. The third column shows the localization detection results of the proposed method.
Table 5. Detection results on MICC-F220 dataset and processing time (average, per image)

| Method | TPR (%) | FPR (%) | Time (s) |
|---|---|---|---|
| Amerini et al. (2011) | 100 | 8 | 4.94 |
| Hashmi et al. (2014) | 80 | 10 | 2 |
| Proposed | 100 | 6.36 | 0.43 |
Table 6. Detection results of the plain copy-move forgery on benchmark dataset

| Method | Precision (%) | Recall (%) | F1 (%) |
|---|---|---|---|
| Pan and Lyu (2010) | 80.49 | 68.75 | 74.15 |
| Amerini et al. (2011) | 88.4 | 79.2 | 79.2 |
| Shivakumar and Baboo (2011) | 77.27 | 70.83 | 73.90 |
| Li et al. (2015) | 70.17 | 83.33 | 76.19 |
| Hashmi et al. (2014) | 88.89 | 80 | 84.21 |
| Yang et al. (2017) | 78.33 | 97.92 | 87.04 |
| Proposed | 90.90 | 100 | 95.23 |
3. Robustness to rotation and scaling: Robustness of CMFD algorithms to affine transformations such as scaling and rotation is very important. The proposed method shows strong invariance to scaling and rotation. Its performance remains relatively stable across all scaling parameters and rotation angles. The proposed method achieves 100% TPR for both scaling and rotation (see Table 7).
4.7. Detection of Multiple Copies
The proposed method also addresses the detection of multiple copies of the same region. As the number of possible combinations of matched regions grows, the chance of false matches increases. Table 8 reports the detection results of the proposed method on multiple copies in terms of TPR for different threshold values. In general, performance decreases, mainly because the random choice of small blocks typically yields regions with only a few matched keypoints (Christlein et al., 2012).
5. CONCLUSION
Detecting copy-move forgery is a challenging task. The dependence on empirical thresholds in existing CMFD techniques is a critical issue. This paper proposes a CMFD method that automatically determines the optimal number of clusters using internal cluster validity measures. The method uses the gap statistic and Silhouette width to set the cutoff threshold in AHC automatically. The proposed method has been evaluated using two publicly available datasets, MICC-F220 and the benchmark dataset. Experimental results show that the proposed method yields higher precision and recall, and lower false negative values, than comparable works in the literature. It also shows robustness against different attacks such as scaling, rotation, JPEG compression, and additive noise. Future work can extend the method to videos and to other types of image forgery such as image splicing.
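Table 7. Robustness to processing operations

| Attack | Total number of images | Parameter | TP | Recall (%) |
|---|---|---|---|---|
| Rotation (angle) | 240 | 2° | 48 | 100 |
| | | 4° | 48 | |
| | | 6° | 48 | |
| | | 8° | 48 | |
| | | 10° | 48 | |
| Scaling (ratio) | 480 | 0.91 | 48 | 100 |
| | | 0.93 | 48 | |
| | | 0.95 | 48 | |
| | | 0.97 | 48 | |
| | | 0.99 | 48 | |
| | | 1.01 | 48 | |
| | | 1.03 | 48 | |
| | | 1.05 | 48 | |
| | | 1.07 | 48 | |
| | | 1.09 | 48 | |
| JPEG compression (quality factor) | 432 | 20 | 45 | 98.61 |
| | | 30 | 47 | |
| | | 40 | 46 | |
| | | 50 | 48 | |
| | | 60 | 48 | |
| | | 70 | 48 | |
| | | 80 | 48 | |
| | | 90 | 48 | |
| | | 100 | 48 | |
| Noise (standard deviation) | 240 | 0.02 | 47 | 96.67 |
| | | 0.04 | 47 | |
| | | 0.06 | 46 | |
| | | 0.08 | 46 | |
| | | 0.10 | 46 | |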
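The recall column of Table 7 pools the true positives over all parameter values of an attack. The short sketch below recomputes it from the per-parameter TP counts listed in the table; the variable and function names are illustrative only.

```python
# True-positive counts per parameter value, copied from Table 7.
# Each parameter value was applied to the same 48 forged images.
IMAGES_PER_SETTING = 48

true_positives = {
    "rotation": [48, 48, 48, 48, 48],
    "scaling": [48] * 10,
    "jpeg": [45, 47, 46, 48, 48, 48, 48, 48, 48],
    "noise": [47, 47, 46, 46, 46],
}


def overall_recall(tp_counts, images_per_setting=IMAGES_PER_SETTING):
    """Recall in percent, pooled over all parameter settings of one attack."""
    total_images = images_per_setting * len(tp_counts)
    return 100.0 * sum(tp_counts) / total_images


recalls = {attack: round(overall_recall(tps), 2)
           for attack, tps in true_positives.items()}
print(recalls)
```

For JPEG compression this gives 426 detected forgeries out of 432, and for noise addition 232 out of 240, which matches the percentages in the table.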
Table 8. Results for multiple copy-move forgery

| Threshold | TPR (%) |
|---|---|
| 0.4 | 93.75 |
| 0.5 | 95.83 |
| 0.6 | 97.92 |
REFERENCES
Abe, R., Miyamoto, S., & Hamasuna, Y. (2017). Hierarchical Clustering Algorithms with Automatic Estimation of The Number of Clusters. Proceedings of the 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems (IFSA-SCIS) (pp. 1–5). Academic Press. doi:10.1109/IFSA-SCIS.2017.8023241
Al-Qershi, O. M., & Khoo, B. E. (2013). Passive Detection of Copy-Move Forgery in Digital Images: State-Of-The-Art. Forensic Science International, 231(1–3), 284–295. doi:10.1016/j.forsciint.2013.05.027 PMID:23890651
Amerini, I., Ballan, L., Caldelli, R., Del Bimbo, A., Del Tongo, L., & Serra, G. (2013). Copy-Move Forgery Detection and Localization by Means of Robust Clustering With J-Linkage. Signal Processing Image Communication, 28(6), 659–669. doi:10.1016/j.image.2013.03.006
Amerini, I., Ballan, L., Caldelli, R., Del Bimbo, A., & Serra, G. (2011). A SIFT-Based Forensic Method for Copy-Move Attack Detection and Transformation Recovery. IEEE Transactions on Information Forensics and Security, 6(3 Part 2), 1099–1110. doi:10.1109/TIFS.2011.2129512
Bakiah, N., Warif, A., Wahid, A., Wahab, A., Yamani, M., Idris, I., & Shamshirband, S. et al. (2016). Copy-move forgery detection: Survey, challenges and future directions. Journal of Network and Computer Applications, 75, 259–278. doi:10.1016/j.jnca.2016.09.008
Cao, Y., Gao, T., Fan, L., & Yang, Q. (2011a). A Robust Detection Algorithm for Copy-Move Forgery in Digital Images. Forensic Science International, 214(1–3), 33–43. PMID:21813252
Cao, Y., Gao, T., Fan, L., & Yang, Q. (2011b). A Robust Detection Algorithm for Region Duplication in Digital Images. International Journal of Digital Content Technology and Its Applications, 5(6), 95–103. doi:10.4156/jdcta.vol5.issue6.12
Christlein, V., Riess, C., & Angelopoulou, E. (2010). A Study on Features for the Detection of Copy-Move Forgeries. Sicherheit, 105–116.
Christlein, V., Riess, C., Jordan, J., Riess, C., & Angelopoulou, E. (2012). An Evaluation of Popular Copy-Move Forgery Detection Approaches. IEEE Transactions on Information Forensics and Security, 7(6), 1841–1854. doi:10.1109/TIFS.2012.2218597
Dada, A., Dharaskar, R. V., & Thakare, V. M. (2016). A Survey on Keypoint Based Copy-Paste Forgery Detection Techniques. Procedia Computer Science, 78, 61–67. doi:10.1016/j.procs.2016.02.011
Everitt, B. S., Landau, S., Leese, M., & Stahl, D. (2011). Cluster Analysis (5th ed.). London: Wiley. doi:10.1002/9780470977811
Farid, H. (2006). Exposing Digital Forgeries in Scientific Images. Proceedings of the 8th Workshop on Multimedia and Security (pp. 29–36). ACM.
Fischler, M., & Bolles, R. C. (1981). Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, 24(6), 381–395. doi:10.1145/358669.358692
Fridrich, J., Soukal, D., & Lukáš, J. (2003). Detection of Copy-Move Forgery in Digital Images. Proceedings of the Digital Forensic Research Workshop. Academic Press.
Gan, G. (2007). Data Clustering Theory, Algorithms, and Applications (Vol. 20). Alexandria, Virginia: SIAM. doi:10.1137/1.9780898718348
Halkidi, M., Batistakis, Y., & Vazirgiannis, M. (2002). Clustering Validity Checking Methods: Part II. ACM Special Interest Group on Management of Data, 31(3), 40–45.
Hashmi, M. F., Anand, V., & Keskar, A. G. (2014). Copy-move Image Forgery Detection Using an Efficient and Robust Method Combining Un-decimated Wavelet Transform and Scale Invariant Feature Transform. AASRI Procedia, 9, 84–91. doi:10.1016/j.aasri.2014.09.015
Hastie, T., Tibshirani, R., & Friedman, J. (2003). The Elements of Statistical Learning Data (2nd ed.). Springer.
Hu, J., Zhang, H., Gao, Q., & Huang, H. (2011). An Improved Lexicographical Sort Algorithm of Copy-Move Forgery Detection. Proceedings of the 2nd International Conference on Networking and Distributed Computing (pp. 23–27). Academic Press. doi:10.1109/ICNDC.2011.12
Junhong, Z. (2010). Detection of Copy-Move Forgery Based on One Improved LLE method. Proceedings of the 2nd International Conference on Advanced Computer Control (Vol. 4, pp. 547–550). Academic Press. doi:10.1109/ICACC.2010.5486861
Katta, R. M. R., & Patro, C. S. (2017). Influence of Web Attributes on Consumer Purchase Intentions. International Journal of Sociotechnology and Knowledge Development, 9(2), 1–16. doi:10.4018/IJSKD.2017040101
Kumar, R., & Sharma, H. (2008). Comparative Study of CLAHE, DSIHE & DHE Schemes. International Journal of Research in Management, Science & Technology, 1(1), 1–4.
Li, J., Li, X., Yang, B., & Sun, X. (2015). Segmentation-Based Image Copy-Move Forgery Detection Scheme. IEEE Transactions on Information Forensics and Security, 10(3), 507–518. doi:10.1109/TIFS.2014.2381872
Lin, X., Li, J. H., Wang, S. L., Liew, A. W. C., Cheng, F., & Huang, X. S. (2018). Recent Advances in Passive Digital Image Security Forensics: A Brief Review. Engineering, 4(1), 29–39. doi:10.1016/j.eng.2018.02.008
Liu, G., Wang, J., Lian, S., & Wang, Z. (2010). A passive image authentication scheme for detecting region-duplication forgery with rotation. Journal of Network and Computer Applications, 34(5), 1557–1565. doi:10.1016/j.jnca.2010.09.001
Lowe, D. G. (2004). Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2), 91–110. doi:10.1023/B:VISI.0000029664.99615.94
Ma, J., Fan, X., Yang, S. X., Zhang, X., & Zhu, X. (2017). Contrast Limited Adaptive Histogram Equalization Based Fusion for Underwater Image Enhancement.
Mahajan, P., Delhi, N., & Rana, A. (2018). Sentiment Classification-How to Quantify Public Emotions Using Twitter. International Journal of Sociotechnology and Knowledge Development, 10(1), 57–71. doi:10.4018/IJSKD.2018010104
Mohamdian, Z., & Pouyan, A. A. (2013). Detection of Duplication Forgery in Digital Images in Uniform and Non-Uniform Regions. Proceedings of the 15th International Conference on Computer Modelling and Simulation (Vol. 1, pp. 455–460). Academic Press. doi:10.1109/UKSim.2013.94
Muja, M., & Lowe, D. G. (2009). Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration. Proceedings of the VISAPP International Conference on Computer Vision Theory and Applications (Vol. 1, pp. 331–340). Academic Press.
Muliawaty, L., Alamsyah, K., Salamah, U., & Maylawati, D. S. (2019). The Concept of Big Data in Bureaucratic Service Using Sentiment Analysis. International Journal of Sociotechnology and Knowledge Development, 11(3), 13. doi:10.4018/IJSKD.2019070101
Pan, X., & Lyu, S. (2010). Region Duplication Detection Using Image Feature Matching. IEEE Transactions on Information Forensics and Security, 5(4), 857–867. doi:10.1109/TIFS.2010.2078506
Qureshi, M. A., & Deriche, M. (2015). A Bibliography of Pixel-Based Blind Image Forgery Detection Techniques. Signal Processing: Image Communication, 39(Part A), 46–74.
Redi, J. A., Taktak, W., & Dugelay, J. (2011). Digital Image Forensics: A Booklet for Beginners. Multimedia Tools and Applications, 51(1), 133–162. doi:10.1007/s11042-010-0620-1
Sharma, S., & Ghanekar, U. (2019). Spliced Image Classification and Tampered Region Localization Using Local Directional Pattern. International Journal of Image, Graphics and Signal Processing, 11(3), 35–42. doi:10.5815/ijigsp.2019.03.05
Shivakumar, B. L., & Baboo, S. (2011). Automated Forensic Method for Copy-Move Forgery Detection based on Harris Interest Points and SIFT Descriptors. International Journal of Computers and Applications, 27(3), 9–17. doi:10.5120/3283-4472
Aya Hegazi is a teaching assistant at faculty of computers and informatics, Benha University.
Ahmed Taha received his M.Sc. degree and his Ph.D. degree in computer science, at Ain Shams University, Egypt, in February 2009 and July 2015 respectively. He currently works as assistant professor at computer science department, Benha University, Egypt. His research interests are: computer vision and image processing (human behavior analysis - video surveillance systems), digital forensics (image forgery detection – document forgery detection), security (encryption – steganography – cloud computing), content-based retrieval (Arabic text retrieval - video scenes classification-video scenes retrieval – trademark image retrieval - closed-caption technology).
Mazen M. Selim received the BSc in Electrical Engineering in 1982, the MSc in 1987 and PhD in 1993 from Zagazig University (Benha Branch) in electrical and communication engineering. He is now a Professor at the faculty of computers and informatics, Benha University. His areas of interest are image processing, biometrics, sign language, content-based image retrieval (CBIR), face recognition, and watermarking.
Tibshirani, R., Walther, G., & Hastie, T. (2001). Estimating the Number of Clusters in A Data Set Via the Gap Statistic. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 63(2), 411–423. doi:10.1111/1467-9868.00293
Tijdink, J. K., Verbeke, R., & Smulders, Y. M. (2014). Publication Pressure and Scientific Misconduct in Medical Scientists. Journal of Empirical Research on Human Research Ethics, 9(5), 64–71. doi:10.1177/1556264614552421 PMID:25747691
Walia, S., & Kumar, K. (2018). An Eagle-Eye View of Recent Digital Image Forgery Detection Methods. Proceedings of the International Conference on Next Generation Computing Technologies (pp. 469–487). Academic Press. doi:10.1007/978-981-10-8660-1_36
Yang, F., Li, J., Lu, W., & Weng, J. (2017). Copy-Move Forgery Detection Based on Hybrid Features. Engineering Applications of Artificial Intelligence, 59, 73–83. doi:10.1016/j.engappai.2016.12.022