Research Article An Active Learning Classifier for Further ...downloads.hindawi.com/journals/cmmm/2016/4345936.pdf · three major parts: image processing [], ... to build an automatic

Research ArticleAn Active Learning Classifier for Further Reducing DiabeticRetinopathy Screening System Cost

Yinan Zhang12 and Mingqiang An3

1School of Computer Science and Technology Beijing Institute of Technology Beijing 100081 China2College of Computer Science and Information Engineering Tianjin University of Science and Technology Tianjin 300222 China3College of Science Tianjin University of Science and Technology Tianjin 300222 China

Correspondence should be addressed to Yinan Zhang zhyn163com

Received 7 May 2016 Revised 24 June 2016 Accepted 26 July 2016

Academic Editor Georgy Gimelrsquofarb

Copyright copy 2016 Y Zhang and M AnThis is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Diabetic retinopathy (DR) screening system raises a financial problem For further reducing DR screening cost an active learningclassifier is proposed in this paperOur approach identifies retinal images based on features extracted by anatomical part recognitionand lesion detection algorithms Kernel extreme learning machine (KELM) is a rapid classifier for solving classification problemsin high dimensional space Both active learning and ensemble technique elevate performance of KELM when using small trainingdatasetThe committee only proposes necessarymanual work to doctor for saving cost On the publicly availableMessidor databaseour classifier is trained with 20ndash35 of labeled retinal images and comparative classifiers are trained with 80 of labeled retinalimages Results show that our classifier can achieve better classification accuracy than Classification and Regression Tree radialbasis function SVM Multilayer Perceptron SVM Linear SVM and 119870 Nearest Neighbor Empirical experiments suggest that ouractive learning classifier is efficient for further reducing DR screening cost

1 Introduction

Diabetic retinopathy (DR) [1] is one of the most commoncauses of blindness in diabetic mellitus research [2] Millionsof diabetic patients suffer from DR DR not only deprivespatientsrsquo sight [3] but also brings heavy burden to theirfamily and society [4] In 2012 [5] 291 million Americans(93 of the population) were diagnosed with diabetes Amore serious problem is that 76 of those patients werebecoming with worsening diabetes Each year approximately14 million Americans are diagnosed with diabetes With thedevelopment of diabetes about 40 of patients may losesight from DR [6] Recently new technique named opticalcoherence tomography (OCT) is popular in developed coun-tries OCT can perform cross-sectional imaging but OCTis still too expensive for many areas which are economicallyunderdeveloped Thus DR screening system is still useful fordiabetic patients in many low income areas This challengingproblem causes a demand of a better computer-aided DRscreening system [7 8]

Many computer-aided screening systems can reducemas-sive manual screening effectively [9 10] Gardner et al [11]propose an automatic DR screening system with artificialneural network Most of computer-aided DR screeningresearches focus on reducing and improving doctorrsquos work Itis noteworthy that Liew et al [12] point out a critical issue thisissue is about accuracy and cost effectiveness A typical DRscreening hardware system includes but is not limited to highresolution camera computing system and storage systemThe software system forDR screening systemmainly containsthree major parts image processing [13] feature extraction[14] and classification [15] (automatic diagnosis result ofcomputer)The architecture of computer-aidedDR screeninghardware system is clear and stable nowadays but softwaresystem still has much space for development Classificationis an important breakthrough for improving DR screeningsystem especially when applying active learning methodrather than supervised learning or unsupervised learningmethod

Hindawi Publishing CorporationComputational and Mathematical Methods in MedicineVolume 2016 Article ID 4345936 10 pageshttpdxdoiorg10115520164345936

2 Computational and Mathematical Methods in Medicine

(a) DR grade I (b) DR grade II (c) DR grade III

Figure 1 Representative images having different grades (I II and III)

(a) DR grade IV (b) DR grade V (c) DR grade VI

Figure 2 Representative images having different grades (IV V and VI)

However to build an automatic computer-aided screen-ing system raised a financial problem [16] A DR screeningsystem faces three major requirements nowadays Firstwhen a company builds a DR screening system for medicalpurpose the accuracy is a keymeasurement Second hospitaladministrators need that this DR system not only can makeclassification automatically but also can save more moneyand time when it is running in the future Third the DRscreening system should raise meaningful queries to doctorsas many as possible and cases that can be easily diagnosed bycomputer should be queried as little as possible Therefore aDR screening system should further have the following threecharacters (1) more accuracy (2) smaller training datasetand (3) active learning

For solving the above problems we propose an ensemble-kernel extreme learningmachine (KELM) based active learn-ing with querying by committee classifier Below are themajor contributionsconclusions of our work

(1) Retinal image is easy to snap but manually diagnos-ing a result is of high cost

(2) Kernel technique is suitable for classifying retinalimages which is related to classification in highdimensional spaces

(3) Ensemble learning (bagging technique) can ele-vate classifierrsquos performance Particularly overfittingoccurs when training set is small

(4) Active learning can further reduce the size of trainingdataset compared to traditional machine learningmethod in DR screening system

(5) The committee can avoid unnecessary queries todoctor this is distinctive to other state-of-the-art DRscreening systems

This paper is organized as follows Section 2 showsbackground of retinal images and related works Section 3presents the details of the proposed classifier and Section 4presents empirical experiment and results Conclusions aredrawn in the final section

2 Retinal Images and Related Works

21 Retinal Image and Detections Figure 1 shows DR grade[17] I II and III Figure 2 shows DR grade IV V and VIMicroaneurysm appears as tiny red dots in Figure 1 withthe worsening of diabetes exudates occur as primary signsof diabetic retinopathy In Figures 1 and 2 inhomogeneityappears and it can lead to loss of sight

Doctors give diagnosis results based on 3 major lesionsmicroaneurysm exudates and inhomogeneity Moreoverthere are two useful anatomical detections macula and opticdisc In Table 1 five essential detections of DR screening arelisted

22 Classic DR Screening System Architecture DR screeningsystem [19] captures retinal images and gives diagnosisresults The classic architecture of DR screening system isshown in Figure 3

A high resolution camera is used for capturing retinalimages Then retinal images are saved into storage systemUsually there is a preprocessing for retinal images this

Computational and Mathematical Methods in Medicine 3

Light

Camera Retinal images

Computing system and storage system

Featureextraction

Classifier

Diagnosis result

Classificationplane

Negativeside = minus1

Positiveside = 1

minus1

Figure 3 DR screening system architecture

process enhances image contrast and so forth In the nextstepmultiple features of retinal images are extracted by imagealgorithms Extracted features are represented in high dimen-sional space Therefore original retinal image is mapped intohigh dimensional space One retinal image is presented as avector (or a dot) in this high dimensional space Finally atrained classifier gives a binary result (minus11 or 10)This binaryresult indicates that the vector belongs to the ldquopositiverdquo side orthe ldquonegativerdquo side Figure 3 also exemplifies a brief workflowin two-dimensional space

Many DR screening system studies focus on the per-formance of accuracy measurement Dabramoff et al [20]pointed out that DR screening system is an investigated fieldFleming et al [21] showed that reducingmassmanual effort isthe key of creating DR screening system Meanwhile severalresearchers focus on automatic diagnosis of patients havingDR [22] Even though those researches and applications savemassive manual work DR screening system cost can befurther reduced

3 Ensemble Extreme Learning MachineBased Active Learning Classifier withQuery by Committee

In this section the proposed classifier is described indetail With the consideration of accuracy time consumingcomputing resource consuming high dimensional featuresclassification and reducing artificial labeling we adapt kernelextreme learningmachine (KELM) and thenwe use ensemblelearning (bagging technique) to solve overfitting problemMoreover the bagging-KELM can be trained in parallelcomputing architecture

31 Active Learning with Query by Committee Active learn-ing [23] has control over instances once active learningreaches query paradigm in which the committee can assignnew artificial labeling task for human Query by committee(QBC) [24] is a learning method which adopts decisionof a committee to decide an unlabeled instance should beasked for artificial labeling or not Once an artificial labeling

Table 1 Essential detection of DR screening

Detection target Detail information

Microaneurysm An extremely small aneurysm it looks astiny red dots in retinal image

ExudatesFat or lipid leak from aneurysms or bloodvessels it looks as small and bright spotswith irregular shape

Inhomogeneity Regions of retina are different and unusual

MaculaThe macula is an oval-shaped pigmentedarea near the center of the retina of thehuman eye

Optic disc The optic disc is the point of exit forganglion cell axons leaving the eye

task is finished the new artificial labeled instance is addedinto training set Therefore the committee reduces testinginstances and enlarges training set with asking for artificiallabeling work Since QBC has control over instances fromwhich it learns QBC maintains a group of hypotheses fromtraining set those hypotheses represent the version spaceFor real word problems the size of committee should be bigenough

Figure 4 shows the proposed method Our approachcontains 3 cyclic steps

Step 1 KELM with bagging technique and committee aretrained synchronously Initial training instances consist ofextracted features from DR images and corresponding arti-ficial label marks

Step 2 After the training procedure the committee canpropose necessary queries for bagging-KELM

Step 3 In the testing procedure both bagging-KELMand thecommittee receive testing instances and then bagging-KELMasks permission from the committee If committee agreeswith bagging-KELM bagging-KELM gives a hypothesis foran unlabeled instance as final diagnosis result Howeverif committee gives disagreement the committee proposes


Kernel extreme learning machine 1

Kernel extreme learning machine N

Bagging technique

Committee

Testinginstances

(decreasingly)

Traininginstances

(increasingly)

Artificial labeling

ActionData

QueryArtificial labeled instances

Traininginstances

Figure 4 Active learning with query with committee

the unlabeled retinal image to human doctor (this increasestraining instances)

To conclude our approach is dealing with 3 optimizationproblems (1) increasing training dataset as little as possible(2) increasing training dataset with necessary queries and (3)decreasing testing dataset with control

32 Kernel Extreme Learning Machine Extreme learningmachine (ELM) [25] is a fast and accurate single-forwardlayer feedforward neural network classification algorithmproposed by Huang et al Different from traditional neuralnetworks ELM assigns perceptron with random weights inthe input layer and then the weights of output layer can becalculated catalytically by finding the least square solutionTherefore ELM is faster than other learning algorithms forneural network the time cost is extremely low

For diabetic retinopathy screening given a trainingdataset 119883 with 119873 labeled instances (119909

119894 119905119894) 119894 = 1 2

119873 where each 119909119894is an 119899 dimensional vector 119909

119894=

[1199091 1199092 1199093 119909

119899]119879

isin 119877119899 and 119905

119894is an indicating label

of corresponding instance the output of signal-layer forwardnetwork with 119872 perceptrons in middle layer can be calcu-lated as follows

119872

sum

119894=1

120573119894119892119894(119909119895) =

119872

sum

119894=1

120573119894119892 (119908119894+ 119909119895+ 119887119894) = 119905119895

119895 = 1 2 119873

(1)

where119908119894is the weights connecting the 119894th middle perceptron

with the input perceptron120573119894is the weights connecting the 119894th

hidden perceptron with the output perceptron and 119887119894is the

bias of the 119894th hidden perceptron119892(sdot) denotes nonlinear activation function some classical

activation functions are listed as follows

(1) Sigmoid function

119866 (119886 119887 119909) =1

1 + exp (minus (119886 sdot 119909 + 119887)) (2)

(2) Fourier function

119866 (119886 119887 119909) = sin (119886 sdot 119909 + 119887) (3)

(3) Hard limit function

119866 (119886 119887 119909) =

1 if 119886 sdot 119909 minus 119887 ge 0

0 otherwise(4)

(4) Gaussian function

119866 (119886 119887 119909) = exp (minus119887 119909 minus 1198862) (5)

(5) Multiquadrics function

119866 (119886 119887 119909) = (119909 minus 1198862+ 1198872)12

(6)

Equation (1) can be expressed in a compact equation asfollows

119867120573 = 119879 (7)

where119867 is the middle layer output matrix

119867 =

[[[[

[

119892 (11990811199091+ 1198871) sdot sdot sdot 119892 (119908

1198731199091+ 119887119873)

d

119892 (1199081119909119872+ 1198871) sdot sdot sdot 119892 (119908

119873119909119872+ 119887119873)

]]]]

]

(8)

where 120573 is the matrix of middle-to-output weights and 119879 isthe target matrix


In (8) weights 119908119894and bias 119887

119894are assigned random float

number and 119892(sdot) is selected as sigmoid function thereforethe output of middle perceptron can be determined very fastwhich is119867 in (7)

The remainingwork isminimum square error estimation

min120573

1003817100381710038171003817119867120573 minus 1198791003817100381710038171003817 (9)

The smallest norm least squares solution for (9) can becalculated by applying the definition of the Moore-Penrosegeneralized inverse the solution is as follows

= 119867minus1119879 (10)

where119867minus1 is the generalized inverse of matrix119867The least squares solution of (10) based on Kuhn-Tucker

conditions can be written as follows

120573 = 119867119879(1

119862+ 119867119867

119879)

minus1

119879 (11)

where119867 is themiddle layer output119862 is regulation coefficientand 119879 is the expected output matrix of instances

Therefore the output function is

119891 (119909) = ℎ (119909)119867119879(1

119862+ 119867119867

119879)

minus1

119879 (12)

The kernel matrix of ELM can be defined as follows

119872 = 119867119867119879 119898119894119895= ℎ (119909

119894) ℎ (119909119895) = 119896 (119909

119894 119909119895) (13)

Therefore the output function 119891(119909) of kernel extreme learn-ing machine can be expressed as follows

119891 (119909) = [119896 (119909 1199091) 119896 (119909 119909119873)] (1

119862+119872)

minus1

119879 (14)

where 119872 = 119867119867119879 and 119896(119909 119910) is the kernel function of

perceptrons in middle layerWe adopt three kernel functions in this paper they are as

follows

POLY for some positive integer 119889

119896 (119909 119910) = (1 + ⟨119909 119910⟩)119889 (15)

RBF for some positive number 119886

119896 (119909 119910) = exp(minus⟨(119909 minus 119910) (119909 minus 119910)⟩

(21198862)) (16)

MLP for a positive number 119901 and negative number 119902

119896 (119909 119910) = tanh (119901 ⟨119909 119910⟩ + 119902) (17)

ComparedwithELMKELMperforms similarly to or bet-ter than ELM and KELM is more stable [25] Compared withSVM KELM spends much less time without performanceloses

33 Bagging Technique By applying ensemble learning [26]to our approach classifier can obtain better classificationperformance when dealing with overfitting problem broughtby small training set We apply bagging technique to enhanceKELM classifier Bagging technique seeks to promote diver-sity among the methods it combines In the initializationprocedure we adopt multiple different kernel functions anddifferent parameters Therefore a group of classifiers can bebuilt for bagging technique implementation

When applying a group of KELMs with bagging methodeach KELM is trained independently and then those KELMsare aggregated via a majority voting technique Given atraining set TR = (119909

119894 119910119894) | 119894 = 1 2 119899 where 119909

119894

is extracted features from retinal images and 119910119894is corre-

sponding diagnosis result we then build119872 training datasetsrandomly to construct119872 KELMs bagging independently

The bootstrap technique is as follows

init

given training datasets TR119872 distinctive KELM

119894 119894 = 1 2 3 119872

training

construct subtraining datasets STR119894| 119894 =

1 2 119872 formTRwith resampling randomlyand replacementtrain KELM

119894with STR

119894

classification

calculate hypothesis 119867119894of ELM

119894 119894 = 1 2 3

119872

perform majority voting of119867119894

4 Experiments and Results

41MessidorDatabase andEvaluationCriteria For empiricalexperiment we use public Messidor dataset [27] that consistsof 1151 instances Images are of 45-degree field of view andthree different resolutions (440lowast960 2240lowast1488 and 2304lowast1536)

Each image is labeled 0 or 1 (negative or positive diagnos-tic result) 540 images are labeled 0 the remnants are labeled1 Many researches did 5-fold (or 10-fold) cross-validationThus 80 of database is training dataset and the remaining20 instances are testing dataset We train Classification andRegression Tree (CART) radial basis function (RBF) SVMMultilayer Perceptron (MLP) SVM Linear (Lin) SVM and119870 Nearest Neighbor (KNN) with 80 of database and theremaining 20 is as testing dataset

For the proposed active learning (AL) classifier we use10ndash20 of database as initial training dataset and give it10ndash15 of database as queries made by committee There-fore 20ndash35 of database is used to train active learningclassifier in total and the remaining 65ndash80 of databaseis testing dataset We also train ELM and KELM with 20ndash35 of database and the remaining 65ndash80 of database is


testing dataset Therefore ELM KELM and our approachare trained with the same amount of labeled instancesthe results can prove the availability of kernel techniquebagging technique and active learning Committee containsall classifiers which were mentioned in this paper

In short we use 80 of Messidor database to train 5classifiers and we cut more than half the training instancesto validate ELM KELM and our approach Details arepresented in Section 43 Each classifier was tested 10 timesThe recommendations of the British Diabetic Association(BDA) are 80 sensitivity and 95 specificity Thereforeaccuracy sensitivity and specificity are compared amongthose classifiers

Sensitivity accuracy and specificity are defined as fol-lows

Sensitivity = TPTP + FN

Accuracy = TP + TNTP + FP + TN + FN

Specificity = TNFP + TN

(18)

where TP FP TN and FN are the true and false positive andtrue and false negative classifications of a classifier

42 Retinal Image Features Table 2 lists every featureextracted from retinal images with feature information Theretinal images are mapped in a 19-dimensional space

The details of image features used in Messidor databaseare listed as follows

(1) Quality Assessment Messidor database contains suffi-cient quality image for a reliable diagnosis result Afterdetecting vessel system the box count values can becalculated for a supervised learning classifier Thevessel segmentation algorithm is based on HiddenMarkov Random Fields (HMRF) [28]

(2) Prescreening Images are classified as abnormal or tobe needed for further processing Every image is splitinto disjoint subregions and inhomogeneity measure[29] is extracted for each subregion Then a classifierlearns from these features and classifies the images

(3) MA Detection Microaneurysms appear as small reddots and they are hard to find efficiently The MAdetection method used in Messidor database is basedon preprocessing method and candidate extractorensembles [30]

(4) Exudate Exudates are bright small dots with irregularshape By following the likely complex methodologyas for microaneurysm detection [30] it combinespreprocessing methods and candidate extractors forexudate detection [31]

(5) Macula Detection Macula is located in the center ofthe retina By extracting the largest object from imagewith brighter surroundings [32] the macula can bedetected effectively

Table 2 Image features of Messidor dataset [18]

Feature Feature information

(0) The binary result of quality assessment 0 badquality 1 sufficient quality

(1)The binary result of prescreening where 1indicates severe retinal abnormality and 0 itslack

(2ndash7)The results of MA detection Each feature valuestands for the number of MAs found at theconfidence levels alpha = 05 sdot sdot sdot 1 respectively

(8ndash15)

Contain the same information as (2ndash7) forexudates However as exudates are representedby a set of points rather than the number ofpixels constructing the lesions these featuresare normalized by dividing the number oflesions with the diameter of the ROI tocompensate different image sizes

(16)

The Euclidean distance of the center of themacula and the center of the optic disc toprovide important information regarding thepatientrsquos condition This feature is alsonormalized with the diameter of the ROI

(17) The diameter of the optic disc

(18) The binary result of the AMFM-basedclassification

(19)Class label 1 containing signs of DR(accumulative label for the Messidor classes 12 and 3) 0 no signs of DR

(6) Optic Disc Detection Optic disc is anatomical struc-ture with circular shape Ensemble-based system ofQureshi et al [33] is used for optic disc detection

(7) AMFM-Based Classification The Amplitude-Modulation Frequency-Modulation method [34]decomposes the green channel of the image and thensignal processing techniques are applied to obtainrepresentations which reflect the texture geometryand intensity of the structures

43 Experiment Results CART RBF MLP Lin KNN ALKELM and ELM are compared in experiments For CARTRBF MLP Lin and KNN 80 of labeled retinal imagesare offered to these 5 classifiers as training dataset Theparameters of those 5 classifiers are determined by grid searchon training dataset To AL KELM and ELM are also used asgrid-searchmethod to set hidden layerTheMATLABR2015aversion is used in this paper

Figure 5 shows the boxplot of normalized correct classifi-cations In Figure 5(a) AL has 115 instances (10 of Messidordatabase) for initial training and 115 more instances (10 ofMessidor database) are queries from committee Therefore230 (20 ofMessidor database) instances are used in trainingAL in total For testing kernel bagging technique and activelearning KELM and ELM are offered 230 training instances(20 of Messidor database) Other 5 classifiers are trainedwith 920 instances (80 of Messidor database)


CART RBF MLP Lin KNN AL KELM ELM

065

075

085

06

07

08

(a)CART RBF MLP Lin KNN AL KELM ELM

065

075

085

06

07

08

(b)

Figure 5 (a) Using 10 of Messidor dataset as initial training dataset and the committee proposes 10 of dataset as queries and (b) using10 of Messidor dataset as initial training dataset and the committee proposes 15 of dataset as queries


065

075

085

06

07

08


065

075

085

06

07

08

(b)


KELM is classified more accurately than ELM by the ker-nel technique Bagging technique and active learningmethodfurther elevate classification accuracy of KELM ComparingAL with other 7 classifiers its correct classification is about2sim20 higher than other classifiers in Figure 5(a) and thetraining dataset of AL is only 25 of other 5 classifiersMLP CART and KNN are the worst three classifiers RBFperforms a little better than ELM but RBF has three timesmore labeled instances than ELM Lin performs better thanELM and KELM but it is slightly lower than AL

Similarly in Figure 5(b) active learning and other clas-sifiers have been tested again KELM gives more correctclassification results and AL is better than both ELM andKELM Comparing AL with other classifiers AL achievesbetter classification accuracy and AL only needs 287 labeledinstances for training

In Figure 6 CART RBF MLP Lin and KNN are exactlythe same as in Figure 5 In Figure 6(a) a bigger initialtraining dataset (15) is used to train AL and 25 labeledinstances are given to KELM and ELM as training dataset InFigure 6(b) 30 labeled instances are given to KELM andELM for training In Figure 6 kernel technique helps ELMto produce more correct classification results and the activelearning method still further boosts KELM In Figure 6 Linperforms closely to AL but Lin has nearly triple the size oftraining datasetTherefore the disadvantage of Lin is the needof massive manual work

Figure 7 shows 20 of labeled instances as initial trainingdataset for AL Figure 7 proves the same conclusion as shownin Figures 5-6 kernel technique bagging technique andactive learning are effective and efficient for improving ELMIt should be noticed that KELM and ELM are tested twice


CART RBF MLP Lin KNN AL KELM ELM06

065

07

075

08

085

(a)

06

07

08


065

075

085

(b)


Table 3 Details of Figures 5ndash7 (accuracy)

Classifiers Max Min Mean

Figures 5ndash7

CART 80 0775 0693 0725RBF 80 0805 0728 0770MLP 80 0718 0623 0669Lin 80 0856 0741 0818KNN 80 0758 0646 0720

Figure 5(a)AL 10 10 0872 0799 0838KELM 20 0832 0785 0808ELM 20 0804 0725 0783

Figure 5(b)AL 10 15 0880 0807 0846KELM 25 0803 0776 0790ELM 25 0799 0705 0761





Max of column 0886 0812 0851

which Figures 5(b) and 6(a) present Similarly Figures 6(b)and 7(a) also present twice the comparison results of ELMand KELM Table 4 lists results of Figures 5ndash7 in detail

Table 3 contains 5 columns the names of classifiers areattached with experiment parameters For instance KNN 80is that KNN classifier is trained with 80 of instances The

Table 4 Sensitivity and specificity

Classifiers Sensitivity mean Specificity meanCART 80 7764 8310RBF 80 7807 8617MLP 80 7441 8452Lin 80 8029 8896KNN 80 7713 8813AL 10 10 8169 9146KELM 20 7944 9081ELM 20 7892 9003AL 10 15 8238 9154KELM 25 7843 9072ELM 25 7780 9026AL 15 10 8267 9211KELM 25 8054 9123ELM 25 7987 9078AL 15 15 8263 9208KELM 30 8188 9191ELM 30 8135 9035AL 20 10 8278 9158KELM 30 8183 9003ELM 30 8110 8961AL 20 15 8263 9172KELM 35 8195 9018ELM 35 8121 8896

max min and mean are calculated from 10 runs AL 10 15 isthat 10 of labeled instances are as initial training dataset and15 of labeled instances are queries form committee

In Table 3 the lower limit and mean value of AL 10 10are the highest in columnThe upper limit of AL 20 10 is thehighest in column

In Table 4 mean values of sensitivity and specificityare listed for all classifiers The first column of Table 4 is


corresponding experiment as Table 3 Second column ismeanvalues of sensitivity and third column is mean values ofspecificity All mean values are statistical result of 10 runsSensitivity mean values are between 074 and 082 specificitymean values are between 083 and 092

44 Discussions In this section we present two issues aboutexperiment results (1) what are the advantages of KELM (2)Is the proposed method suitable for medical implement

TheKELM is ELMwith kernel technique this approach issimilar to SVM The kernel technique can map original data(linear inseparable) into a new space (higher dimensionalspace but linear separable) for a linear classifier The majorcontribution of KELM is that kernel technique helps ELM toface a high dimensional classification problem which is fasterthan kernel-SVMwhen solving the same problem Especiallyin this paper the Messidor dataset contains 18 features allclassifiers must give a hypothesis in 18-dimensional space

The proposedmethod is suitable for implementationTherecommendations of the British Diabetic Association (BDA)[35] are 80 sensitivity and 95 specificityThe test results ofour method are close to those two standards

5 Conclusion

In this paper an active learning classifier is presented forfurther reducing diabetic retinopathy screening system costClassic researches did 5- or 10-fold cross-validation whichimplies that massive diagnosis results should be preparedbeforehand Unlike other state-of-the-art methods we focuson further reducing cost We use kernel extreme learningmachine to deal with classification problem in high dimen-sional space For solving overfitting problembrought by smalltraining set we adapt ensemble learning method By usingactive learning with QBC the ensemble-KELM learns frommanual diagnosis result by necessary queries

Our approach and other comparative classifiers had beenvalidated on public diabetic retinopathy dataset Kernel tech-nique and bagging technique are also tested and analyzedEmpirical experiment shows that our approach can classifyunlabeled retinal images with higher accuracies than othercomparative classifiers but the size of training dataset ismuch smaller than other comparative classifiers With theconsideration of implementation the performance of ourapproach is close to the recommendations of the BritishDiabetic Association

Competing Interests

The authors declare that they have no competing interests

References

[1] A N Kollias and M W Ulbig ldquoDiabetic retinopathy earlydiagnosis and effective treatmentrdquoDeutsches Arzteblatt Interna-tional vol 107 no 5 pp 75ndash84 2010

[2] K Venkatnarayan J Pboyle and T Jthompson ldquoLifetimerisk for diabetes mellitus in the United Statesrdquo Journal of the

American Medical Association vol 290 no 14 pp 1884ndash18902003

[3] MMemon S Memon and N Bakhtnizamani ldquoSight threaten-ing diabetic retinopathy in typemdash2 diabetes mellitusrdquo PakistanJournal of Ophthalmology vol 30 no 1 pp 1ndash9 2014

[4] V R Driver M Fabbi L A Lavery and G Gibbons ldquoThe costsof diabetic foot the economic case for the limb salvage teamrdquoJournal of Vascular Surgery vol 52 no 3 pp 17Sndash22S 2010

[5] R Li S Sshrestha and R Dlipman ldquoDiabetes self-managementeducation and training among privately insured persons withnewly diagnosed diabetesmdashUnited States 2011-2012rdquoMorbidityandMortalityWeekly Report vol 63 no 46 pp 1045ndash1049 2014

[6] W C Chan L T Lim M J Quinn F A Knox D McCanceandRM Best ldquoManagement and outcome of sight-threateningdiabetic retinopathy in pregnancyrdquo Eye vol 18 no 8 pp 826ndash832 2004

[7] J Cuadros and G Bresnick ldquoEyePACS an adaptabletelemedicine system for diabetic retinopathy screeningrdquoJournal of Diabetes Science and Technology vol 3 no 3 pp509ndash516 2009

[8] G G Gardner D Keating T H Williamson and A TElliott ldquoAutomatic detection of diabetic retinopathy using anartificial neural network a screening toolrdquo British Journal ofOphthalmology vol 80 no 11 pp 940ndash944 1996

[9] P Hscanlon C Pwilkinson and S Jaldington Screening forDiabetic Retinopathy 2009

[10] C Sinthanayothin V Kongbunkiat and S Phoojaruen-chanachai ldquoAutomated screening system for diabetic retinopa-thyrdquo in Proceedings of the 3rd International Symposium on Imageand Signal Processing andAnalysis (ISPA rsquo03) vol 2 pp 915ndash920September 2003

[11] G G Gardner D Keating T H Williamson and A TElliott ldquoAutomatic detection of diabetic retinopathy using anartificial neural network a screening toolrdquo The British Journalof Ophthalmology vol 80 no 11 pp 940ndash944 1996

[12] G Liew C A Egan A Rudnicka et al ldquoEvaluation of auto-mated software grading of diabetic retinopathy and comparisonwithmanual image gradingmdashan accuracy and cost effectivenessstudyrdquo Investigative Ophthalmology ampVisual Science vol 55 no13 p 2293 2014

[13] J Canny ldquoA computational approach to edge detectionrdquo IEEETransactions on Pattern Analysis and Machine Intelligence vol8 no 6 pp 679ndash698 1986

[14] A Hyvarinen ldquoFast and robust fixed-point algorithms forindependent component analysisrdquo IEEE Transactions on NeuralNetworks vol 10 no 3 pp 626ndash634 1999

[15] K G M M Alberti and P Z Zimmet ldquoDefinition diagnosisand classification of diabetesmellitus and its complications Part1 diagnosis and classification of diabetes mellitus Provisionalreport of aWHO consultationrdquoDiabetic Medicine vol 15 no 7pp 539ndash553 1998

[16] S Vijan T P Hofer and R A Hayward ldquoCost-utility Analysisof screening intervals for diabetic retinopathy in patients withtype 2 diabetes mellitusrdquo Journal of the American MedicalAssociation vol 283 no 7 pp 889ndash896 2000

[17] K Shotliff and G Duncan ldquoDiabetic retinopathy summary ofgrading and management criteriardquo Practical Diabetes Interna-tional vol 23 no 9 pp 418ndash420 2006

[18] httparchiveicsucieduml[19] B Antal and A Hajdu ldquoAn ensemble-based system for auto-

matic screening of diabetic retinopathyrdquo Knowledge-Based Sys-tems vol 60 pp 20ndash27 2014


[20] M Dabramoff J Mreinhardt and S Rrussell ldquoAutomated earlydetection of diabetic retinopathyrdquo Ophthalmology vol 117 no6 pp 1147ndash1154 2010

[21] A D Fleming K A Goatman S Philip J A Olson and P FSharp ldquoAutomatic detection of retinal anatomy to assist diabeticretinopathy screeningrdquo Physics in Medicine and Biology vol 52no 2 pp 331ndash345 2007

[22] A Sopharak B Uyyanonvara and S Barman ldquoAutomatic exu-date detection for diabetic retinopathy screeningrdquo ScienceAsiavol 35 no 1 pp 80ndash88 2009

[23] A Krogh and J Vedelsby ldquoNeural network ensembles cross val-idation and active learningrdquo in Neural Information ProcessingSystems 1995

[24] H Sseung M Opper and H Sompolinsky ldquoQuery by commit-teerdquo in Proceedings of the 5th Annual Workshop on Computa-tional Learning Theory Pittsburgh Pa USA July 1992

[25] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[26] G-B Huang H Zhou X Ding and R Zhang ldquoExtremelearning machine for regression and multiclass classificationrdquoIEEE Transactions on Systems Man and Cybernetics Part BCybernetics vol 42 no 2 pp 513ndash529 2012

[27] T G Dietterich ldquoAn experimental comparison of three meth-ods for constructing ensembles of decision trees baggingboosting and randomizationrdquo Machine Learning vol 40 no2 pp 139ndash157 2000

[28] GKovacs andAHajdu ldquoExtraction of vascular system in retinaimages using averaged one-dependence estimators and orienta-tion estimation in hiddenmarkov randomfieldsrdquo inProceedingsof the 8th IEEE International Symposium on Biomedical ImagingFrom Nano to Macro pp 693ndash696 IEEE Chicago Ill USAApril 2011

[29] B Antal A Hajdu Z Maros-Szabo Z Torok A Csutakand T Peto ldquoA two-phase decision support framework forthe automatic screening of digital fundus imagesrdquo Journal ofComputational Science vol 3 no 5 pp 262ndash268 2012

[30] B Antal I Lazar and A Hajdu ldquoAn ensemble approachto improve microaneurysm candidate extractionrdquo in SignalProcessing and Multimedia Applications vol 222 of Commu-nications in Computer and Information Science pp 378ndash394Springer Berlin Germany 2012

[31] B Nagy B Harangi B Antal and A Hajdu ldquoEnsemble-basedexudate detection in color fundus imagesrdquo in Proceedings of the7th International Symposium on Image and Signal Processing andAnalysis (ISPA rsquo11) pp 700ndash703Dubrovnik Croatia September2011

[32] B Antal and A Hajdu ldquoA stochastic approach to improvemacula detection in retinal imagesrdquo Acta Cybernetica vol 20no 1 pp 5ndash15 2011

[33] R J Qureshi L Kovacs B Harangi B Nagy T Peto and AHajdu ldquoCombining algorithms for automatic detection of opticdisc and macula in fundus imagesrdquo Computer Vision and ImageUnderstanding vol 116 no 1 pp 138ndash145 2012

[34] C Agurto V Murray E Barriga et al ldquoMultiscale AM-FM methods for diabetic retinopathy lesion detectionrdquo IEEETransactions on Medical Imaging vol 29 no 2 pp 502ndash5122010

[35] G P Leese ldquoRetinal photography screening for diabetic eyediseaserdquo Tech Rep British Diabetic Association 1997

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014


MEDIATORSINFLAMMATION

of


Behavioural Neurology

EndocrinologyInternational Journal of



Disease Markers


BioMed Research International

OncologyJournal of



Oxidative Medicine and Cellular Longevity


PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of



Computational and Mathematical Methods in Medicine

OphthalmologyJournal of


Diabetes ResearchJournal of



Research and TreatmentAIDS


Gastroenterology Research and Practice


Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom


(a) DR grade I (b) DR grade II (c) DR grade III

Figure 1 Representative images having different grades (I II and III)

(a) DR grade IV (b) DR grade V (c) DR grade VI

Figure 2 Representative images having different grades (IV V and VI)

However to build an automatic computer-aided screen-ing system raised a financial problem [16] A DR screeningsystem faces three major requirements nowadays Firstwhen a company builds a DR screening system for medicalpurpose the accuracy is a keymeasurement Second hospitaladministrators need that this DR system not only can makeclassification automatically but also can save more moneyand time when it is running in the future Third the DRscreening system should raise meaningful queries to doctorsas many as possible and cases that can be easily diagnosed bycomputer should be queried as little as possible Therefore aDR screening system should further have the following threecharacters (1) more accuracy (2) smaller training datasetand (3) active learning

For solving the above problems we propose an ensemble-kernel extreme learningmachine (KELM) based active learn-ing with querying by committee classifier Below are themajor contributionsconclusions of our work

(1) Retinal image is easy to snap but manually diagnos-ing a result is of high cost

(2) Kernel technique is suitable for classifying retinalimages which is related to classification in highdimensional spaces

(3) Ensemble learning (bagging technique) can ele-vate classifierrsquos performance Particularly overfittingoccurs when training set is small

(4) Active learning can further reduce the size of trainingdataset compared to traditional machine learningmethod in DR screening system

(5) The committee can avoid unnecessary queries todoctor this is distinctive to other state-of-the-art DRscreening systems

This paper is organized as follows Section 2 showsbackground of retinal images and related works Section 3presents the details of the proposed classifier and Section 4presents empirical experiment and results Conclusions aredrawn in the final section

2 Retinal Images and Related Works

21 Retinal Image and Detections Figure 1 shows DR grade[17] I II and III Figure 2 shows DR grade IV V and VIMicroaneurysm appears as tiny red dots in Figure 1 withthe worsening of diabetes exudates occur as primary signsof diabetic retinopathy In Figures 1 and 2 inhomogeneityappears and it can lead to loss of sight

Doctors give diagnosis results based on 3 major lesionsmicroaneurysm exudates and inhomogeneity Moreoverthere are two useful anatomical detections macula and opticdisc In Table 1 five essential detections of DR screening arelisted

22 Classic DR Screening System Architecture DR screeningsystem [19] captures retinal images and gives diagnosisresults The classic architecture of DR screening system isshown in Figure 3

A high resolution camera is used for capturing retinalimages Then retinal images are saved into storage systemUsually there is a preprocessing for retinal images this


Light



Featureextraction

Classifier

Diagnosis result

Classificationplane


Positiveside = 1

minus1






















Bagging technique

Committee

Testinginstances

(decreasingly)

Traininginstances

(increasingly)

Artificial labeling

ActionData


Traininginstances






119894 119905119894) 119894 = 1 2


119894=

[1199091 1199092 1199093 119909

119899]119879

isin 119877119899 and 119905



119872

sum

119894=1

120573119894119892119894(119909119895) =

119872

sum

119894=1

120573119894119892 (119908119894+ 119909119895+ 119887119894) = 119905119895

119895 = 1 2 119873

(1)







119866 (119886 119887 119909) =1

1 + exp (minus (119886 sdot 119909 + 119887)) (2)


119866 (119886 119887 119909) = sin (119886 sdot 119909 + 119887) (3)


119866 (119886 119887 119909) =

1 if 119886 sdot 119909 minus 119887 ge 0

0 otherwise(4)


119866 (119886 119887 119909) = exp (minus119887 119909 minus 1198862) (5)


119866 (119886 119887 119909) = (119909 minus 1198862+ 1198872)12

(6)


119867120573 = 119879 (7)


119867 =

[[[[

[

119892 (11990811199091+ 1198871) sdot sdot sdot 119892 (119908

1198731199091+ 119887119873)

d

119892 (1199081119909119872+ 1198871) sdot sdot sdot 119892 (119908

119873119909119872+ 119887119873)

]]]]

]

(8)







min120573

1003817100381710038171003817119867120573 minus 1198791003817100381710038171003817 (9)


= 119867minus1119879 (10)



120573 = 119867119879(1

119862+ 119867119867

119879)

minus1

119879 (11)



119891 (119909) = ℎ (119909)119867119879(1

119862+ 119867119867

119879)

minus1

119879 (12)


119872 = 119867119867119879 119898119894119895= ℎ (119909

119894) ℎ (119909119895) = 119896 (119909

119894 119909119895) (13)


119891 (119909) = [119896 (119909 1199091) 119896 (119909 119909119873)] (1

119862+119872)

minus1

119879 (14)



follows


119896 (119909 119910) = (1 + ⟨119909 119910⟩)119889 (15)



(21198862)) (16)


119896 (119909 119910) = tanh (119901 ⟨119909 119910⟩ + 119902) (17)




119894 119910119894) | 119894 = 1 2 119899 where 119909

119894




init


119894 119894 = 1 2 3 119872

training



119894with STR

119894

classification


119894 119894 = 1 2 3

119872













(18)














(8ndash15)


(16)











065

075

085

06

07

08


065

075

085

06

07

08

(b)



065

075

085

06

07

08


065

075

085

06

07

08

(b)








065

07

075

08

085

(a)

06

07

08


065

075

085

(b)




Figures 5ndash7








Max of column 0886 0812 0851













5 Conclusion



Competing Interests


References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of

















Light



Featureextraction

Classifier

Diagnosis result

Classificationplane


Positiveside = 1

minus1






















Bagging technique

Committee

Testinginstances

(decreasingly)

Traininginstances

(increasingly)

Artificial labeling

ActionData


Traininginstances






119894 119905119894) 119894 = 1 2


119894=

[1199091 1199092 1199093 119909

119899]119879

isin 119877119899 and 119905



119872

sum

119894=1

120573119894119892119894(119909119895) =

119872

sum

119894=1

120573119894119892 (119908119894+ 119909119895+ 119887119894) = 119905119895

119895 = 1 2 119873

(1)







119866 (119886 119887 119909) =1

1 + exp (minus (119886 sdot 119909 + 119887)) (2)


119866 (119886 119887 119909) = sin (119886 sdot 119909 + 119887) (3)


119866 (119886 119887 119909) =

1 if 119886 sdot 119909 minus 119887 ge 0

0 otherwise(4)


119866 (119886 119887 119909) = exp (minus119887 119909 minus 1198862) (5)


119866 (119886 119887 119909) = (119909 minus 1198862+ 1198872)12

(6)


119867120573 = 119879 (7)


119867 =

[[[[

[

119892 (11990811199091+ 1198871) sdot sdot sdot 119892 (119908

1198731199091+ 119887119873)

d

119892 (1199081119909119872+ 1198871) sdot sdot sdot 119892 (119908

119873119909119872+ 119887119873)

]]]]

]

(8)







min120573

1003817100381710038171003817119867120573 minus 1198791003817100381710038171003817 (9)


= 119867minus1119879 (10)



120573 = 119867119879(1

119862+ 119867119867

119879)

minus1

119879 (11)



119891 (119909) = ℎ (119909)119867119879(1

119862+ 119867119867

119879)

minus1

119879 (12)


119872 = 119867119867119879 119898119894119895= ℎ (119909

119894) ℎ (119909119895) = 119896 (119909

119894 119909119895) (13)


119891 (119909) = [119896 (119909 1199091) 119896 (119909 119909119873)] (1

119862+119872)

minus1

119879 (14)



follows


119896 (119909 119910) = (1 + ⟨119909 119910⟩)119889 (15)



(21198862)) (16)


119896 (119909 119910) = tanh (119901 ⟨119909 119910⟩ + 119902) (17)




119894 119910119894) | 119894 = 1 2 119899 where 119909

119894




init


119894 119894 = 1 2 3 119872

training



119894with STR

119894

classification


119894 119894 = 1 2 3

119872













(18)














(8ndash15)


(16)











065

075

085

06

07

08


065

075

085

06

07

08

(b)



065

075

085

06

07

08


065

075

085

06

07

08

(b)








065

07

075

08

085

(a)

06

07

08


065

075

085

(b)




Figures 5ndash7








Max of column 0886 0812 0851













5 Conclusion



Competing Interests


References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of



















Bagging technique

Committee

Testinginstances

(decreasingly)

Traininginstances

(increasingly)

Artificial labeling

ActionData


Traininginstances






119894 119905119894) 119894 = 1 2


119894=

[1199091 1199092 1199093 119909

119899]119879

isin 119877119899 and 119905



119872

sum

119894=1

120573119894119892119894(119909119895) =

119872

sum

119894=1

120573119894119892 (119908119894+ 119909119895+ 119887119894) = 119905119895

119895 = 1 2 119873

(1)







119866 (119886 119887 119909) =1

1 + exp (minus (119886 sdot 119909 + 119887)) (2)


119866 (119886 119887 119909) = sin (119886 sdot 119909 + 119887) (3)


119866 (119886 119887 119909) =

1 if 119886 sdot 119909 minus 119887 ge 0

0 otherwise(4)


119866 (119886 119887 119909) = exp (minus119887 119909 minus 1198862) (5)


119866 (119886 119887 119909) = (119909 minus 1198862+ 1198872)12

(6)


119867120573 = 119879 (7)


119867 =

[[[[

[

119892 (11990811199091+ 1198871) sdot sdot sdot 119892 (119908

1198731199091+ 119887119873)

d

119892 (1199081119909119872+ 1198871) sdot sdot sdot 119892 (119908

119873119909119872+ 119887119873)

]]]]

]

(8)







min120573

1003817100381710038171003817119867120573 minus 1198791003817100381710038171003817 (9)


= 119867minus1119879 (10)



120573 = 119867119879(1

119862+ 119867119867

119879)

minus1

119879 (11)



119891 (119909) = ℎ (119909)119867119879(1

119862+ 119867119867

119879)

minus1

119879 (12)


119872 = 119867119867119879 119898119894119895= ℎ (119909

119894) ℎ (119909119895) = 119896 (119909

119894 119909119895) (13)


119891 (119909) = [119896 (119909 1199091) 119896 (119909 119909119873)] (1

119862+119872)

minus1

119879 (14)



follows


119896 (119909 119910) = (1 + ⟨119909 119910⟩)119889 (15)



(21198862)) (16)


119896 (119909 119910) = tanh (119901 ⟨119909 119910⟩ + 119902) (17)




119894 119910119894) | 119894 = 1 2 119899 where 119909

119894




init


119894 119894 = 1 2 3 119872

training



119894with STR

119894

classification


119894 119894 = 1 2 3

119872













(18)














(8ndash15)


(16)











065

075

085

06

07

08


065

075

085

06

07

08

(b)



065

075

085

06

07

08


065

075

085

06

07

08

(b)








065

07

075

08

085

(a)

06

07

08


065

075

085

(b)




Figures 5ndash7








Max of column 0886 0812 0851













5 Conclusion



Competing Interests


References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of





















min120573

1003817100381710038171003817119867120573 minus 1198791003817100381710038171003817 (9)


= 119867minus1119879 (10)



120573 = 119867119879(1

119862+ 119867119867

119879)

minus1

119879 (11)



119891 (119909) = ℎ (119909)119867119879(1

119862+ 119867119867

119879)

minus1

119879 (12)


119872 = 119867119867119879 119898119894119895= ℎ (119909

119894) ℎ (119909119895) = 119896 (119909

119894 119909119895) (13)


119891 (119909) = [119896 (119909 1199091) 119896 (119909 119909119873)] (1

119862+119872)

minus1

119879 (14)



follows


119896 (119909 119910) = (1 + ⟨119909 119910⟩)119889 (15)



(21198862)) (16)


119896 (119909 119910) = tanh (119901 ⟨119909 119910⟩ + 119902) (17)




119894 119910119894) | 119894 = 1 2 119899 where 119909

119894




init


119894 119894 = 1 2 3 119872

training



119894with STR

119894

classification


119894 119894 = 1 2 3

119872













(18)














(8ndash15)


(16)











065

075

085

06

07

08


065

075

085

06

07

08

(b)



065

075

085

06

07

08


065

075

085

06

07

08

(b)








065

07

075

08

085

(a)

06

07

08


065

075

085

(b)




Figures 5ndash7








Max of column 0886 0812 0851













5 Conclusion



Competing Interests


References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of























(18)














(8ndash15)


(16)











065

075

085

06

07

08


065

075

085

06

07

08

(b)



065

075

085

06

07

08


065

075

085

06

07

08

(b)








065

07

075

08

085

(a)

06

07

08


065

075

085

(b)




Figures 5ndash7








Max of column 0886 0812 0851













5 Conclusion



Competing Interests


References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of


















065

075

085

06

07

08


065

075

085

06

07

08

(b)



065

075

085

06

07

08


065

075

085

06

07

08

(b)








065

07

075

08

085

(a)

06

07

08


065

075

085

(b)




Figures 5ndash7








Max of column 0886 0812 0851













5 Conclusion



Competing Interests


References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of


















065

07

075

08

085

(a)

06

07

08


065

075

085

(b)




Figures 5ndash7








Max of column 0886 0812 0851













5 Conclusion



Competing Interests


References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of





















5 Conclusion



Competing Interests


References











































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of






































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of





















of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of
















Documents

Research Article An Active Learning Classifier for Further ...downloads.hindawi.com/journals/cmmm/2016/4345936.pdf · three major parts: image processing [], ... to build an automatic