Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
DOI: 10.4018/IJMSTR.2017040101
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
Copyright©2017,IGIGlobal.CopyingordistributinginprintorelectronicformswithoutwrittenpermissionofIGIGlobalisprohibited.
Use of Images of Leaves and Fruits of Apple Trees for Automatic Identification of Symptoms of Diseases and Nutritional DisordersLucas Garcia Nachtigall, Center for Technological Advancement, Federal University of Pelotas, Pelotas, Brazil
Ricardo Matsumura Araujo, Center for Technological Advancement, Federal University of Pelotas, Pelotas, Brazil
Gilmar Ribeiro Nachtigall, Embrapa Grape & Wine, Vacaria, Brazil
ABSTRACT
Rapiddiagnosisofsymptomscausedbypestattack,diseasesandnutritionalorphysiologicaldisordersinappleorchardsisessentialtoavoidgreaterlosses.ThispaperaimedtoevaluatetheefficiencyofConvolutionalNeuralNetworks(CNN)toautomaticallydetectandclassifysymptomsofdiseases,nutritionaldeficienciesanddamagecausedbyherbicidesinappletreesfromimagesoftheirleavesandfruits.Anoveldatasetwasdevelopedcontaininglabeledexamplesconsistingofapproximately10,000imagesofleavesandapplefruitsdividedinto12classes,whichwereclassifiedbyalgorithmsofmachinelearning,withemphasisonmodelsofdeeplearning.TheresultsshowedtrainedCNNscanovercometheperformanceofexpertsandotheralgorithmsofmachinelearningintheclassificationofsymptomsinappletreesfromleavesimages,withanaccuracyof97.3%andobtain91.1%accuracywithfruitimages.Inthisway,theuseofConvolutionalNeuralNetworksmayenablethediagnosisofsymptomsinappletreesinafast,preciseandusualway.
KeywoRDSApple, Apple Disorders, Artificial Intelligence, Automatic Disease Identification, Classifications, Convolutional Neural Networks, Disorders, Machine Learning
1. INTRoDUCTIoN
Approximately25%ofappleproductionislostbyattackofpests,diseasesandnutritionaldisordersofplants.Arapidandefficientdiagnosisofthesesituationsisessentialtoavoidgreaterlosses.Itisestimatedthat80to90%ofthedamagecausedbypestsanddiseaseswhichattackthecultureoftheappletreeoccursintheleavesandfruits.Amongthesediseases,scabofappletreeandthespotofGlomerellaarethemostimportantones(Valdebenito-Sanhuezaetal.,2008).Inthecaseofpests,whereleavesandfruitsserveasfoodsourceorhosts,themajorissuesareduetotheattacksofthefruitfly,fruitmothandbigcaterpillars(Kovaleski,2004).Ontheotherhand,thedisturbancescausedbytheexcessorlackofnutrientsarevisiblemainlyintheleavesduringthevegetativegrowthphase(Nachtigalletal.,2004).
Acorrectdiagnosisisessentialinordertodefinestrategiesofmanagementandcontrol,andconsequentlyfortherationaluseoffertilizersandpesticides.Onemainobstacletowardsaquick
1
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
2
diagnosisistheneedfortrainedexperts,makingitcostlytocoverlargeareasinatimelymanner.Moreover,expertsoftenspecializeindifferentissues,increasingtherateofmisdiagnosis.
Someapproachesexisttotryandreducethedependencyonexperts.Awidelyusedoneisasimpleprintedguidecontainingphotosandexplanationsonhowtodiagnoseawiderangeofissues(Valdebenito-Sanhuezaetal.,2008).ExpertSystems,oftenbuiltontopofCase-BasedReasoningalgorithms,areappliedtosomecultures-e.g.vine(Fialhoetal.,2012).Thesesystemsstillrequireconsiderableamountsoftrainingandareoftennotveryaccurate,mainlyduetothetypicallyverylargenumberofquestionsrequiredtobeansweredbytheuserandthesensitivitytowronganswers.
Theconceptofusingmachinelearningtodetectsymptomsinplantshasbeenshowntobeapromisingalternativeinrecentyears,whereseveralstudiesusingdifferentapproacheshavebeencarried out to identify or classify symptoms in cultivated plants. Rumpf et al. (2010) aimed todiscriminatediseasedfromnon-diseasedsugarbeetleaves,todifferentiatebetweenthethreetypesofdiseases and to identifydiseases evenbefore specific symptomsbecamevisible.The authorsusedSupportVectorMachineswitharadialbasisfunctionaskerneltoperformtheidentificationandclassificationof symptomsofhealthyorunhealthy leaves.As input theyusednine spectralvegetationindexes,relatedtophysiologicalparametersasfeaturesforanautomaticclassification,resultinginclassificationaccuraciesupto97%onsugarbeetleavesanddiseasedleaves,upto86%classificationaccuracybetweenthethreediseasessymptomsandaccuracybetween65%and90%forpre-symptomaticdetectionofplantdiseases.
Al-Hiaryetal.(2011)proposedamethodologytoautomaticallydetectandclassifyplantleafdiseases from images. The process consists of six main phases: image acquisition, image pre-processing,imagesegmentation,featureextraction,statisticalanalysisandclassificationbyaMLP.Theauthorsused32samplesforeachofthesixclassesofleaves.Featuresweremanuallydefinedas10texturefeaturesextractedfromtheimage.
APerceptronMultilayerforclassifyinggrapeleafdiseaseswasusedbySannakkietal.(2013).Clusteringwasusedtosegmenttheimageintogroups,followedbylesionandmanuallydefinedfeatureextraction.Theyusedaverysmalldataset(33examples)andwereabletoachieveperfectaccuracyinanalso-small(4examples)testset.
RevathiandHemalatha(2014)focusoncottonleafspotdiseases.Theauthorsusedadatasetwith270imagesdividedinto6diseaseclasses.Featuresweremanuallydefined,consistingofleafedge,colorandtexturefeatures.ACrossInformationGainDeepForwardNeuralNetworkwasusedtoperformtheclassification,resultinginanoverallaccuracyof95%.Mohantyetal.(2016)proposedusingGoogLeNetfortheclassificationofleafdiseasesindifferentcultures,usingadatasetwith54306imagesfromdifferentlaboratories,dividedin38classes,andaclustercomputertoprocessthedata,resultinginaccuracyof99.35%.
Withthepopularizationofthetechniquesofartificialintelligenceandmachinelearningfortheclassificationofimages,largebanksofimageswerecreated,whichwereusedtotesttheefficiencyofthenewtechniquesdeveloped.Thus,severaltechniquesweresearchedforalowererrorrateintheclassificationoftheimagespresentinthisimagebank.OneofthetechniquesthatstoodoutforthisclassificationtaskwastheConvolutionalNeuralNetwork(CNN).
Krizhevsky;Sutskever;andHinton(2012)conductedthetrainingofadeepconvolutionalneuralnetworktoclassifyImageNetandobtainedthebestresulteverreportedusingthesub-setspresentedintheILSVRC-2010andILSVRC-2012competitions.Thearchitectureusedhadfiveconvolutedlayersandthreecompletelyconnectedlayers,withatrainingtimebetweenfiveandsixdays.Thisarchitectureobtainedanerrorrateinthetop-1andtop-5setsof38.1%and16.4%,respectively,usingasetoffivenetworks.
Basedon theresultsobtainedbyKrizhevsky;Sutskever;AndHinton(2012),SimonyanandZisserman(2014)mademodificationstotheproposedarchitectureusingsmallconvolutionalfilters(3x3)inordertoperformbetterthanthosepreviouslyproposed.Inaddition,anotherfocuswasonthedepthofthenetwork,whichcouldbeincreasedbythefactthatthefiltersweresmallerinsize.In
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
3
thisarchitecturewerepresentsixteenconvolutedlayersandthreecompletelyconnectedlayers.Theerrorrateinthetop-1andtop-5setswere23.7%and6.8%,respectively,usingasetoftwonetworksandmulti-cropanddenseevaluationtechniques.
Moreover,CNNshavebeenshowntobegoodalternativesinthemostdifferentproblemsofcomputervision,suchascharacterrecognition(Lecunetal.,1998),facerecognition(Lawrenceetal.,1997)andobjectcategorization(Yu;Xu;andGong,2009).
Onthispaper,weextendtheworkdevelopedin(Nachtigaletal.,2016),aswecontinuetoanalyzethepossibilitytoautomaticallyidentifysymptomsofimportantdisorderspresentinappleorchards(Malus domesticaBorkh)usingonlyphotographsofleafandfruits,developinganewdatasetoflabeledimages,containingfivecommondisordersinleaves(deficiencyofpotassiumandmagnesium,applescab,glomerellastainsanddamagecausedbyherbicide)andfiveimportantdisordersinfruits(scab,alternariarot,bull’seyerot,penicilliumrotandcalciumdeficiency–bitterpit).
2. MATeRIAL AND MeTHoDS
Themethodologyconsistsofbuildingadatasetcontaininglabeledexamplesoftentypesofissuescommonlyaffectingappleorchards.Thisdatasetwasrandomlypartitionedintotraining,validationandtestsubsets.ThetrainingandvalidationsubsetswereusedtotrainandoptimizeaConvolutionalNeuralNetworkandaMultilayerPerceptron(asabaseline)techniques.Thetestsetwasthenusedtoassesstheperformanceoftheresultingclassifiersandwasalsopresentedtoexpertsforclassification.
2.1. DatasetThedatasetswerebuiltbyharvestingleavesandfruitsfromthreespeciesofappletrees(Maxigala,Fuji SupremaandPink Lady)andphotographingeachleaforfruitoverawhitebackground.Eachsamplewasthensubjectedtolaboratoryteststoproperlyidentifytheunderlyingdisorderwhichwasthenusedtolabeltheimage.HarvestingoccurredbetweenJanuaryandApril2015fromorchardslocatedinthesouthernpartofBrazil,atEmbrapaUvaeVinho–EstaçãoExperimentaldeFruticulturadeClimaTemperado.
Healthyleavesandfivedisorderswereselectedamongthosecollected,as theyare themostprevalentintheregion.Selectedsymptomsrepresenttwodamagescausedbynutritionalimbalances(deficiencyofpotassiumandmagnesium),twodiseasesdamage(applescab-causedbythefungusVenturia inaequalisandGlomerellastains-causedbythefungusGlomerella cingulata)anddamagecausedbyherbicide(glyphosate).Table1summarizestheleafdataset.Asforthefruits,fivesymptomswerealsoselected,basedastheyarethemostprevalentintheregionandthedamagecausedhaveasuperioreconomyimportance,alongwithhealthyfruits.Thesesymptomsare:onedisordercausedbynutritionalimbalances(calciumdeficiency-bitterpit)andfourdisorderscausedbydiseases(scab,alternariarot,penicilliumrotandbull’seyerot).Table1summarizesthefruitdataset.
The identification of the disorders was conducted by a group of professional agronomistresearchersspecializedinthesesymptomsandwithampleexperienceinplantnutritionandplantpathology.Also,tofurthersupporttheseprofessionals,publishedspecializedbooks(Nachtigalletal., 2004;Valdebenito-Sanhueza et al., 2008)wereused, providing technical informationon thesymptoms.Figures1and2exemplifytheevaluatedsymptomsonleavesandfruits,andtheTable1identifythenumberofsamplescollectedforeachsymptomandhealthysamples.
Aftertheharvestandidentificationoffruitandleaves,acamerawiththeresolutionof12MPwasusedtocapturetheimagesofeachsample.Awhitebackgroundwasusedtophotographeachleaforfruitseparately.Anagronomistfurtheranalyzedtheimages,inordertoverifyifthesymptomsclassificationswerepossible,basedonwhatwaspresentontheimages.Afewimageswhichpresenteddefectsorwereoutsidethecapturestandardsusedwerediscarded.
Inordertoreduceerrorsandproperlyestablishaground-truth,threestrategieswereemployedbytheexpertstoproperlydiagnoseeachissue:
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
4
1. For symptoms caused by nutritional imbalances, samples of normal leaves and leaves withpotassiumandmagnesiumdeficiencysymptomswereselected,eachconsistingof100leaves.Thesamplesthenweredriedat60◦Celsiusinaforcedaircirculationgreenhouseuntilaconstantweight and forwarded to the laboratory for chemical analyzes in order to quantify the total
Figure 1. Leaves with symptoms of: (A) Potassium deficiency; (B) Magnesium deficiency; (C) Scab damage; (D) Glomerella damage; and (E) Damage caused by herbicide
Figure 2. Fruits with the symptoms of: (F) Scab; (G) Alternaria rot; (H) Bull’s eye rot; (I) Penicilium rot; and (J) Calcium deficiency – bitter pit
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
5
concentrationsofnutrients(potassiumandmagnesium).Theanalysisresultsshowthatsampleswith symptoms effectively represent the deficiencies of potassium and magnesium. In theanalyticresultsforsamplesofnormalleavesandpotassiumdeficiencyones,itwasprovedthatthechosenleaveswithvisualsymptomsofpotassiumdeficiency(whichoriginatedtheimagesforthissymptominthedataset)containedonly38%ofthepotassiumconcentrationcomparedwithordinaryandhealthyleaves.Theanalyticresultsforsamplesofnormalleavesandmagnesiumdeficientleaves(attwolevelsofseverity),verifiedthatthesampleleaveswithvisualsymptomsofmagnesiumdeficiency(alsooriginatingtheimagesforthissymptominthedataset)contained63%and40%ofthemagnesiumconcentrationinrelationtonormalleaves.Asforthefruits,100samplesofpulpandpeelswerecollected,bothnormalfruits(withoutsymptoms)andfruitswithsymptomsofcalciumdeficiency.Thesampleswerethensenttothelab,forthechemicalanalysis,aimingatthequantificationoftotalcalciumconcentrations.Theanalyticresultsofthesamplesrepresenteffectivelythecalciumdeficiency.Theseresultsofthepulpandpeelsamples,whichpossesscalciumdeficiencysymptoms,demonstratedthatthesesampleshadonly78%and65%ofthecalciumconcentrationfoundinhealthypulpandpeelfruits,respectively;
2. Forsymptomscausedbydiseasedamageonleaves(appletreescab-causedbythefungusVenturia inaequalis-andGlomerella’sstains-causedbythefungusGlomerella cingulata),andonfruits(appletreescab-causedbythefungusVenturia inaequalis,alternariarot–causedbythefungusAlternaria alternata,bull’s-eyerotinducedbyCryptosporiopsis perennansandPeniciliumrotcausedbythefungusPenicillium expansum),sampleswereselectedwithsymptomspreviouslyidentifiedfortheeachofthediseases.Thesesampleswereincubatedformultiplicationofthecausativeagent(fungus)andafterwasperformedtheisolationoffungiandtheircharacterizationandidentificationusingamicroscope,allowingtheproofofcausalagentsandtheirdamageonappletreeleavesandfruits.Theobtainedresultsprovedthatsampleswithdiseasessymptomseffectivelyrepresenttheselecteddiseases;
3. Forsymptomscausedbyherbicidedamage(glyphosate),itwasdecidedtoconductchemicalanalysisinordertoquantifythetotalconcentrationsofnutrientswhichcouldpossiblycauseconfounding of symptoms in cases where the nutrients concentrations were below normal.Thisdecisionwasduethefactthattheanalysisoftheherbicide’sactiveprincipleisdifficulttocharacterize,onceitisrapidlydegradedontheplantafteritsabsorptionandoriginoftoxicity
Table 1. Number of leaves and fruits collected for each class
Issue Number of Leaves Collected Number of Fruits Collected
Potassiumdeficiency 341 -
Magnesiumdeficiency 355 -
Scabdamage 391 -
Glomerellastain 558 -
Herbicidedamage 325 -
Healthyleaves 569 -
Alternariarot - 490
Calciumdeficiency - 434
Penicilliumrot - 702
Bull’seyerot - 486
Scab - 931
Healthyfruits - 1327
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
6
symptoms.Then,sampleswithherbicidedamagesymptomswereputthroughthesameprotocolsfor nutritional analysis, which results showed these samples did not effectively present anynutritional disorders. Since the samples of leaves with symptoms caused by the herbicideglyphosate(inthreelevelsofseverity)werenotdifferentfromthenutrientconcentrationsofnormal leaves, all nutrients arewithin the range considered as standards for appleorchards(Nachtigalletal.,2004).
2.2. Pre-Processing, Training and evaluationAllimageswerere-sizedtoaresolutionof256x256pixels.Inordertohaveabalanceddataset,werandomlyselected290examples(labeledimages)foreachofthefiveclassescontainingsymptoms.Theresulting1450imagesweredividedintothreesubsets.Atestsubset(hold-outset)wascreatedbyrandomlychoosing15imagesofeachclass.Theremainingexampleswerefurtheredpartitionedinatrainingset(192examples,70%)andvalidationset(83examples,30%).Thisdivisionofsubsetswasduetothefactthatwhencomparingmachinelearningtechniquestoexperts,largetestsubsetswouldbecomeexhaustingforexpertstoclassify,possiblyincreasinghumanerrortotheresults.
Asforthefruitdataset,430examples(labeledimages)wererandomlyselectedforeachofthefiveclassescontainingsymptomsandforthehealthyfruits.Theresulting2580imagesweredividedintothreesubsets.Atestsubset(holdoutset)wascreatedbyrandomlychoosing43(10%)imagesofeachclass.Theremainingexampleswerefurtherpartitionedinatrainingset(271examples,70%)andvalidationset(116examples,30%).Sincetherewasnoclassificationbyexpertsonthefruitdataset,alargerhold-outtestsetwaspossible.
Thetestedlearningalgorithmsweretrainedoverthetrainingsetusingdifferentparametersandconfigurationsandthenappliedtothevalidationset.Thebestperforming(overthevalidationset)parametersofeachalgorithmwerethentrainedovertrainingandvalidationexamplesandappliedtothetestset.Thisprocedurewasadoptedinordertoavoidoverfittingtheresultstothetrainingorvalidationset.Thetestsetwasalsousedbytheexpertsforthemtoprovidetheirclassifications,sothatadirectcomparisonwaspossible.Hence,allresultsreportedinthispaperareforthetestset.
Furthermore,inordertoanalyzethenumberofsamplesneededforasatisfactoryclassification,smallertrainingsubsetswerecreated,randomlyselecting5,10,20,50,100,150,200,and250samplesfromeachclasscontainingsymptoms.Thesetrainingsubsetswerethentestedusingtheprevioustestsubset,withoutanychangesinthenetworkconfigurationafterthevalidationprocesswasfinished.
AfterevaluatingtheCNNapproachbycomparingit’sresultswithMLPnetworksandexperts,275randomlyselectedhealthyleaveswerethenaddedtothetrainsubset,and15randomlyselectedhealthyleaveswereaddedtothetestsubset,inordertoevaluatethenetworkcapacitytodistinguishhealthyleavesfromthefivesymptomschosen.
Theevaluationoftheresultswasdonebycalculatingtheoverallaccuracyofeachclassifierandanalyzingtheresultingconfusionmatrix,alsothemetricsrecall,precisionandkappa(LandisandKoch1977)wereused.
2.3. Convolutional Neural NetworkForthispaper,Caffe(Jiaetal.,2014)andDIGITS(NVIDIA2015)werethetoolsusedtohelpinbuilding,trainingandtestingConvolutionalNeuralNetworks.Multiplearchitecturesweretested,rangingfromshallownetworkswith4layerstodeepnetworksusingsmall(3x3)convolutionfiltersasshownin(SimonyanandZisserman2014).
ThebestresultsoverthevalidationsetwereobtainedbytheAlexNetarchitecture(Krizhevskyetal.,2012).Thisnetworkconsistsoffiveconvolutionallayers,someofwhicharefollowedbymax-poolinglayers,andthreefully-connectedlayersfollowedbysoft-maxanddropoutregularization.Thetrainingparameterswereasfollows:batchsizeof2;asteplearningpolicywasaddedwithagammaof0.2;trainingthroughamaximumof300epochs.
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
7
Oneoftheissuesfaced,wasthehardwarelimitations,whereinlargedatasetsitiscommontousemultiplehighendGPUcardsforfasteranddeeperlearning(Mohantyetal.,2016),thehardwareusedforthisresearchisthefollowing:GeforceGTX760GPU,Intel(R)core(TM)[email protected],16GBRAM.
2.4. Baseline AlgorithmsInordertoprovideabaselinecomparisontoCNNs,wasappliedMultilayerPerceptronstothedataset,usingthesametrainingmethodologyappliedforCNNs.MultilayerPerceptron(MLP),waschosenforbeingcloselyrelatedtoCNNs.ShallowMLPswereabletoachievehighaccuracyontaskssuchasdigitrecognition(Cho1997),imageclassification(Haraetal.,1994)andfeatureextraction(Rucketal.,1990).
DIGITSwasusedtobuildandtesttheMLPs.Differentshallowarchitecturesweretested.Thebestconfigurationconsistedoftwohiddenlayerswith200and500neurons,respectively,withoneunitforeachclassattheoutputlayerand65536inputunits(256x256pixels).
2.5. Classification by expertsTocollecttheclassificationbyexperts,weasked7volunteerresearchersspecializedinappletreestoclassifytheimagesinthetestset,includingtheagronomistwhocollectedtheleavesforthedatabase,sincehehadagreaterknowledgeonthechosendisorders.Theseexpertsarefurtherspecializedindifferentfieldsofresearch,suchasplantpathologyorplantnutrition.
Eachexpertwasgivenaform(Figure3)tochoose,foreachimage,oneofthefiveclasses.Healthyleaveswerenotshowntotheexperts,becauseinthefieldleavesthatseemhealthyarenotchoseninordertoclassifyadisorder.Eachexpertwasshowntheimagesinasuccessionoverarandompermutation.Theywerenotgiventhecorrectanswersandnotimelimitwasenforced.
Afterstoring theformanswers,avotingsystemwasdevelopedinorder tobestevaluate themachinelearningalgorithmswithexperts.Themotivationtocreatethisvotingsystemwasbecausenotallexpertshaveknowledgeofdifferentresearchareas,andinrealcasescenariosmorethanoneexpertcouldbecalledtodiagnosepossibleleafdisorders.
Inordertocreateafinalanswer,thevotingsystemanalyzedwhichsymptomhadthemostvotesforeachimageintheform.Also,incaseofatiebetweensymptoms,thevotesofexpertswhichhadahigheroverallaccuracywerechosen.
Figure 3. Example of image presented in the form shown to experts
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
8
3. ReSULTS AND DISCUSSIoN
Figure4presentstheoverallCNNaccuracywhendifferentnumberofimagesperclassareusedfortrainingandtestedagainstthehold-outtestset.Thegraphshowsalogarithmicincreaseonaccuracywhen training set is increased.The curve largely levels off for samples larger than200 images,evidencingthatthenumberofsamplescollectedwasadequate.
Table2showsthefinalCNNconfusionmatrixwhenappliedtothehold-outtestset.Theoverallaccuracywasof97.3%,withonlytwoincorrectclassifications.TheMLP,appliedtothesametestset,resultedinanaccuracyof77.3%.
Table3showstheindividualaccuracyobtainedfromthe7consultedexperts,alsoshowingtheirspecific fieldof research.Wecanobserve thataccuracyvariesconsiderablyacrossexperts.Thebestresult(93.3%)isworsethantheresultobtainedbytheCNN,buttheaverage(71.9%)wasmuchworseandbelowthatobtainedbytheMLP.Table4presentstheconfusionmatrixoftheexpertswhenaggregatedbyvoting,whereitcanbeseenthattheoverallaccuracyimprovessignificantly.
Table5summarizes theresults.Thebestaccuracywasobtainedby theCNN,witha97.3%accuracy,followedbythevotingsystemwith96.0%accuracy,thebestexpertwith93.3%accuracy,andtheMLPnetworkwith77.3%accuracy.Figure5showstheConfidenceinterval(IC1−α(p))fortheclassifiers,usinga99%confidencelevel.ItispossibletoobservethatalltechniquesaremuchbetterthanrandomchoiceandthataggregatedexpertsandtheCNNhavecomparableperformanceandbotharebetterthantheMLP.Asecondexperimentwasconducted,introducinghealthyleavestothedataset.Inthiscase,onlytheCNNwastested.Thesamedistributionof275imagesfortrainingand15imagesfortestingwasused.
NochangesweremadeintheconfigurationoftheCNNnetworkinordertoavoidanoverfittingto the results.Withhealthy leaves in the trainingand testgroups, theCNNwasable toachieveaccuracyof96.67%,asshowninTable6.ItispossibletoobservethatthetrainedCNNisabletoattainperfectaccuracywhendistinguishingbetweenhealthyandunhealthyleaves.Thisisexpected,
Figure 4. Relation between the number of samples used for learning and the accuracy obtained in the symptoms classification. The solid line represents a logarithmic best-fit function over the data points (circles).
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
9
Table 2. Confusion Matrix resulting from CNN classification on the hold-out set
CNN
Label image
Symptomin leaves Glomerella Herbicide Magnesium
def.Potassium
def. Scab Recall Precision
Glomerella 15 0 0 0 0 100.0% 93.3%
Herbicide 0 15 0 0 0 100.0% 100.0%
Magnesiumdef. 0 0 15 0 0 100.0% 100.0%
Potassiumdef. 0 0 0 14 1 93.3% 100.0%
Scab 1 0 0 0 14 93.3% 93.3%
Accuracy 97.3%
Kappa 0.97
Table 3. Field of research and accuracy obtained by each expert when classifying images in the hold-out test set
Subject Field of Research Accuracy
1 Soil 93.3%
2 Plantpathology 92.0%
3 Postharvest 90.6%
4 Plantpathology 70.6%
5 Plantnutrition 60.0%
6 Cropscience 60.0%
7 Environmentalmanagement 37.3%
Average - 71.9%
Table 4. Confusion Matrix resulting from the aggregation of the classifications provided by human experts
Voting System - Experts
Label image
Symptom in leaves Glomerella Herbicide Magnesium
def.Potassium
def. Scab Recall Precision
Glomerella 15 0 0 0 0 100.0% 100.0%
Herbicide 0 14 1 0 0 93.3% 100.0%
Magnesiumdef. 0 0 14 0 1 93.3% 93.7%
Potassiumdef. 0 0 0 14 1 93.3% 100.0%
Scab 0 0 0 0 15 100.0% 88.2%
Accuracy 96.0%
Kappa 0.95
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
10
Table 5. Summary of the results of the leaf dataset, ordered by accuracy
Technique Accuracy
CNN 97.3%
MLP 77.3%
Votingsystem 96.0%
HighestAccuracyExpert 93.3%
Figure 5. Confidence intervals with 99% confidence level, with normal approximation for each share of accuracy
Table 6. Confusion Matrix resulting from CNN classifications on a hold-out test set when healthy leaves are added to the dataset
CNN
Label image
Symptom in leaves
Glomerella Herbicide Magnesium def.
Potassium def.
Scab Healthy leaves
Recall Precision
Glomerella 15 0 0 0 0 0 100.0% 93.7%
Herbicide 0 15 0 0 0 0 100.0% 100.0%
Magnesiumdef.
0 0 15 0 0 0 100.0% 93.7%
Potassiumdef.
0 0 0 14 1 0 93.3% 100.0%
Scab 1 0 1 0 13 0 86.6% 92.8%
Healthyleaves
0 0 0 0 0 15 100.0% 100.0%
Accuracy 96.6%
Kappa 0.96
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
11
asthedisordersalldisplaystrongsymptomsontheleaves,butanadditionalclassificationerrorisnowmadewhendistinguishingbetweendisorders.
OnepossibilitytoimproveclassificationistotrainaCNNtofirstdistinguishbetweenhealthyandunhealthyandthenanothertofurtherdistinguishbetweendisorders.Indeed,aCNNtrainedonabinaryhealthy/unhealthyclassisalsoabletoattain100%accuracy,henceallowingtheuseofthedisorders-onlyCNN.
Table7showstheCNNconfusionmatrixwhenappliedtothefruittrainandhold-outtestset.Theoverallaccuracywasof91.1%.Lowerresultsonfruitclassificationwereexpected,asthefruitimagesweretakenfromdifferentangles,allowingabetterviewofthesymptom,butpossiblymakingthelearningtaskmorechallenging.AsseenonTable6,theappliedCNNonthefruithold-outtestsetwerealsoabletoachieve100%accuracywhendistinguishinghealthyfruits.
4. CoNCLUSIoN AND FUTURe woRK
InthispaperweproposedtheuseConvolutionalNeuralNetworkstoassistonthetaskofidentificationandclassificationofappletreedisordersfromleafimages.Weusedanoveldatasetconsistingofleafimagescontainingfiveknowndisorders,allconfirmedbylaboratorytests,andcomparedtheresultsofaCNNtothatofaMLPandexperts.
OurresultsshowthataCNNbasedontheAlexNetarchitectureisabletosignificantlyoutperformthebaselineMLP,showingcomparableperformancetothatofagroupexpertsandoutperforminganysingleexpert.WhenapplyingtheCNNtothefruitdataset,wewerealsoabletoachievedesirableresults.Moreover, perfect accuracywasobtainedwhenonlydistinguishingbetweenhealthy andunhealthyleavesorhealthyandunhealthyfruits.
Weconclude thatCNNscomposeaviableandusefuloptionfor this task,withmorerobustclassificationsthansinglehumanexperts.Inthissense,anautomatedsystembasedonthetrainedmodelcouldcontributetowardsdiagnosisreliabilityandcostreduction.
Comparedtopreviousworks,ourapproachdoesnotrequirespecializedequipmenttocapturetheimagesoranysortoffeatureextractionorengineering.TheCNNisabletolearnrelevantfeatures
Table 7. Confusion Matrix resulting from CNN classifications on the fruit hold-out test set
CNN - Fruits
Label image
Symptom in fruits
Alternaria rot
Calcium def.
Penicillium rot
Bull’s eye rot
Scab Healthy fruits Recall Precision
Alternariarot 40 0 1 1 1 0 93.0% 87.0%
Calciumdef. 0 43 0 0 0 0 100.0% 89.6%
Pinicilliumrot 4 2 35 2 0 0 81.4% 87.5%
Bull’seyerot 0 3 4 35 0 1 93.3% 100.0%
Scab 2 0 0 1 39 1 90.7% 97.5%
Healthyfruits 0 0 0 0 0 43 100.0% 95.5%
Accuracy 91.1%
Kappa 0.89
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
12
fromthedata, towhichweattribute the improvedperformance.Thisalsoallowsfor thegeneralapproachtobeusedindifferentdisordersorevencultureswithchangesonlytothedataset.Thisisimportanttoallowfortheautomaticimprovementofthemodelwhenmoredataismadeavailable.
AlthoughtheSVMtechniqueshowedhighaccuraciesintherelatedwork,whenappliedthesamemethodologyastheothertechniques,usingnopre-processingorfeatureextraction,theresultsdidnotachievemorethan60%accuracies,thereforetheywerenotincludedinthisarticle.
Severallinesoffutureworkarebeingplanned.Weareexpandingthecurrentdatasettomakeavailablemorediverseexamples.Whilewehaveshownthatmoreexamplesobtainedinthesamewaywillonlyprovidemarginalimprovements,theintroductionofmorediversity(e.g.differentbackgroundsandlightconditions)couldallowforbetterperformance.Wearealsointroducingadditionaldisordersandculturestotesthowtheapproachscaleswiththesesettings.
Differentarchitecturesarealsobeingconsidered.Webelievethatacombinationofmoreexamplesandimprovedarchitecturecouldleadtoasystemthatcanconsistentlyoutperformexperts.Finally,weaimatintegratingourmethodologyintoworkingsystemsthatcanbeusedonthefield,inlesscontrolledconditions.
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
13
ReFeReNCeS
Al-Hiary,H.A.,Ahmad,S.B.,Reyalat,M.,Braik,M.,&Alrahamneh,Z.(2011).FastandAccurateDetectionand Classification of Plant Diseases. International Journal of Computers and Applications, 17(1), 31–38.doi:10.5120/2183-2754
Cho,S.(1997).Neural-networkclassifiersforrecognizingtotallyunconstrainedhandwrittennumerals.IEEE Transactions on Neural Networks,8(1),43–53.doi:10.1109/72.554190PMID:18255609
Fialho,F.,Garrido,L.,Botton,M.,deMelo,G.,Fajardo,T.,&Naves,R.(2012).Diagnosticodedoençasepragasnaculturadavideirausandoosistemaespecialistauzum.RetrievedFebruary02,2016,fromhttp://ainfo.cnptia.embrapa.br/digital/bitstream/item/73877/1/cot128.pdf
Hara,Y.,Atkins,R.,Yueh,S.,Shin,R.,&Kong,J. (1994).Applicationofneuralnetworks toradar imageclassification.IEEE Transactions on Geoscience and Remote Sensing,32(1),100–109.doi:10.1109/36.285193
Jia,Y.,Shelhamer,E.,Donahue,J.,Karayev,S.,Long,J.,Girshick,R.,&Darrell,T.(2014).Caffe.InProceedings of the ACM International Conference on Multimedia - MM ‘14.
Kovaleski,A.(2004).Pragas.InKOVALESKI, A. Maçã: fitossanidade(pp.11–33).BentoGonçalves:EmbrapaUvaeVinho.
Krizhevsky,A.,Sutskever, I.,&Hinton,G. (2012). Imagenet classificationwithdeepconvolutionalneuralnetworks.InAdvances in Neural Information Processing Systems(pp.1097–1105).
Landis,J.R.,&Koch,G.G.(1977).TheMeasurementofObserverAgreementforCategoricalData.Biometrics,33(1),159–174.doi:10.2307/2529310PMID:843571
Lawrence,S.,Giles,C.L.,Tsoi,A.C.,&Back,A.D.(1997).Facerecognition:Aconvolutionalneural-networkapproach.IEEE Transactions on NeuralNetworks,l(8),98–113.
Lecun,Y.,Bottou,L.,Bengio,Y.,&Haffner,P.(1998).Gradient-basedlearningappliedtodocumentrecognition.InProceedings of the IEEE(pp.2278–2324).
Mohanty,S.P.,Hughes,D.,&Salathe,M.(2016).Inferenceofplantdiseasesfromleafimagesthroughdeeplearning.arXiv:1604.03169
Nachtigall,G.R.,Basso,C.,&Freire,C.J.S.(2004).Nutriçãoeadubaçãodepomares.InG.R.Nachtigall,(Ed.).Maçã:produção.Brasília,EmbrapaInformaçãoTecnológica(pp.63-77).FrutasdoBrasil.
Nachtigall,L.G.,Araujo,R.M.,&Nachtigall,G.R.(2016).ClassificationofAppleTreeDisordersUsingConvolutionalNeuralNetworks.InProceedings of the 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI).
NVIDIA.(2015).Nvidiadigits–interactivedeeplearningGPUtrainingsystem.RetrievedFebruary02,2016,fromhttps://developer.nvidia.com/digits
Revathi,P.,&Hemalatha,M.(2014).Identification of cotton diseases based on cross information gain deep forward neural network classifier with PSO feature selection. International Journal of Engineering and Technology (IJET).
Ruck,D.,Rogers,S.,&Kabrisky,M.(1990).Featureselectionusingamultilayerperceptron.Journal of Neural Network Computing,2(2),40–48.
Rumpf,T.,Mahlein,A.,Steiner,U.,Oerke,E.,Dehne,H.,&Plümer,L.(2010).EarlydetectionandclassificationofplantdiseaseswithSupportVectorMachinesbasedonhyperspectralreflectance.Computers and Electronics in Agriculture,74(1),91–99.doi:10.1016/j.compag.2010.06.009
Sannakki,S.S.,Rajpurohit,V.S.,Nargund,V.B.,&Kulkarni,P.(2013).Diagnosisandclassificationofgrapeleafdiseasesusingneuralnetworks. InProceedings of theFourth International Conference on Computing, Communications and Networking Technologies (ICCCNT),1-5.
Simonyan,K.,&Zisserman,A.(2015).VeryDeepConvolutionalNetworksforLarge-ScaleImageRecognition.RetrievedFebruary02,2016,fromhttps://arxiv.org/abs/1409.1556
International Journal of Monitoring and Surveillance Technologies ResearchVolume 5 • Issue 2 • April-June 2017
14
Ricardo Matsumura Araujo is a Computer Science Professor at the Center for Technological Advancement in the Federal University of Pelotas (UFPel). He holds a PhD and Master Degrees in Computer Science from UFRGS and lectures at the Computer Science, Computer Engineering undergrad courses and the Graduation Program in Computer Science. His research interests include machine learning, computational social science and data science.
Gilmar Ribeiro Nachtigall holds the title of PhD in Agronomy. Gilmar currently works as a researcher in Embrapa Wine & Grapes.
Valdebenito-Sanhueza,R.M.,Nachtigall,G.R.,Kovaleski,A.,dosSantos,R.S.S.,&Spolti,P.(2008).Manual de identificação e controle de doenças, pragas e desequilíbrio nutricional da macieira (pp.45–58).BentoGonçalves:EmbrapaUvaeVinho.
Yu,K.,Xu,W.,&Gong,Y.(2009).AdvancesinNeuralInformationProcessingSystems:Vol.1889-1896.Deep learning with kernel regularization for visual recognition.