
Classification of Abnormalities in Mammograms using

Adaptive Approach

Sreedevi S*1, Terry Jacob Mathew2 and Srikripa V3

1Associate Professor, Dept. of Computer Science, Sree Ayyappa College, Eramallikkara, Chengannur, Kerala, India

2School of Computer Sciences, Mahatma Gandhi University, Kottayam, Kerala, India and MACFAST

3Academic Assistant, IIITM-K, Kerala, India

1 [email protected], [email protected], [email protected]

Abstract

Breast cancer is one of the deadliest diseases today, and its early detection is vital to reducing mortality rates. This paper proposes an automated

diagnosis of mammograms by categorizing them as benign, malignant or normal. The

proposed method concentrates on the algorithmic development of automated noise

removal, contrast enhancement, pectoral muscle removal, segmentation of Region of

Interest (ROI) in micro-calcification clusters, feature extraction, feature selection and

classification of mammograms. Fourteen textural and statistical features are extracted

from the segmented micro-calcification clusters using Gray Level Co-occurrence Matrix

(GLCM) at an angle of 0° and a distance of 3 pixels. A total of 7 features are finally selected from the 14 extracted features for classification. Classification with the Naïve Bayes and Support Vector Machine classifiers resulted in accuracies of 94.89% and 87.35% respectively. The proposed method also reduces dimensionality, memory usage and processing time, improving overall performance.

Keywords: Breast Cancer, Digital Mammography, H-domes transformation, CLAHE, SVM, Naïve Bayes.

1. Introduction

Breast cancer is one of the deadliest diseases today and one of the leading causes of mortality among women around the world. Mammography has emerged as a

major diagnostic procedure in the detection and screening of breast cancer [1]. Among

the large number of imaging modalities available today, high quality mammography is

considered as the most cost-effective and sensitive method for detecting breast cancers at

1 *Corresponding Author

Journal of Information and Computational Science

Volume 10 Issue 2 - 2020

ISSN: 1548-7741

www.joics.org1389

an early stage [2]. The detection of breast cancer at its early stage is important to reduce

mortality, because treatment in the early stages is found to significantly increase the

survival rate of patients [3]. Early detection of breast cancer can be achieved with the

timely screening of mammograms; however, this is heavily reliant on the correct

interpretation of mammograms by an experienced radiologist. Clustered micro-

calcifications present in mammographic X-ray images are an important indicator for

early detection of breast cancer [4]. These micro-calcifications are tiny granule-like deposits of calcium with diameters from about 0.1 mm to 1 mm and an average diameter of about 0.3 mm. A micro-calcification cluster is indicated by the presence of three or more

noticeable micro-calcifications within a square centimeter region of the mammogram [5].

Different methods with highly sophisticated algorithms have been developed for the

automatic detection of breast cancer in digital mammograms. Studies report that relevant

features extracted from the individual micro-calcifications [6] or from the Region of

Interest (ROI), which contains micro-calcification clusters [7] can detect breast cancers

accurately [8].

Pelin et al. [9] developed a wavelet based Support Vector Machine (SVM) method for

capturing information on micro-calcifications in digital mammograms and for the

classification of mammographic masses as benign or malignant. The masses were segmented manually by radiologists and wavelet-based features were extracted from the ROI. The final decision was taken by the classifier trained on the extracted features, giving a total classification accuracy of 84.8%.

Bose et al. [10] presented a new method for the detection and classification of

micro-calcifications in mammogram images. This approach included four stages:

preprocessing, segmentation, feature extraction and classification. In preprocessing, adaptive median filtering was used to remove noise from the image. The pectoral muscles and micro-calcifications were segmented with the Fuzzy c-Means (FCM) clustering algorithm. Nine features were extracted from the Low Low (LL) band of the wavelet transform and an Artificial Neural Network (ANN) was used for classification. This work used only conventional methods for detection and classification.

Deeba et al. [11] developed a new classification method for identifying

abnormalities in digital mammograms using Particle Swarm Optimized Wavelet Neural

Network (PSOWNN). They developed a detection algorithm based on texture energy measures from mammograms. The algorithm was evaluated on a real clinical database of 216 mammograms, yielding an area under the Receiver Operating Characteristic (ROC) curve of 0.96853, with a sensitivity and specificity of 94.167% and 92.105% respectively.

Boulehmi et al. [12] proposed a micro-calcification (MC) detection system in

which they developed automatic methods for the enhancement of a mammogram by

using the method of galactophorous tree interpolation, segmentation of micro-

calcifications using Generalized Gaussian Density (GGD) estimation and a Bayesian

back-propagation neural network. Micro-calcifications were further classified using a

neuro-fuzzy system.




Nashid Alam and Reyer Zwiggelaar [13] proposed a system for automatic

differentiation between benign and malignant MC clusters based on their morphology,

texture, and the distribution of individual and global features using an ensemble

classifier. The relevant features are fed into an ensemble classifier to classify the MC

clusters. The validity of the proposed method was investigated using Mammographic

Image Analysis Society (MIAS) and Digital Database for Screening Mammography

(DDSM) databases. The authors report that their approach outperforms the current state-of-the-art methods.

Birmohan Singh and Manpreet Kaur [14] proposed an approach which

enhances the region of interest, using morphological operations. They extracted two

types of features, related to cluster shape and cluster texture and applied SVM for

classification. A new set of shape features based on the recursive subsampling method is

added to the feature set, which improved the classification accuracy of the system. These

features are capable of differentiating malignant and benign tissue regions. To examine

the performance of the proposed approach, images are taken from DDSM database and

an accuracy of 94.25% was recorded.

The disadvantage of these methods is that they rely on generic existing techniques for noise removal, which leaves the noise incompletely removed. These methods also extract the classification features from the segmented ROI as a whole. To address these limitations of the related work in mammogram classification, a novel approach for removing noise and pectoral muscles is introduced here. For classification, the features are extracted from the micro-calcification clusters that are further segmented from the ROI, in order to obtain more accurate results.

The proposed work concentrates on the algorithmic development of automated

noise removal, contrast enhancement, pectoral muscle removal, finding the ROI,

segmentation of micro-calcification clusters from segmented ROI, feature extraction,

feature selection and classification of mammograms. The mammogram image is denoised by a detection and filtering mechanism. Contrast Limited Adaptive Histogram Equalization (CLAHE) is applied to enhance the contrast of the image, while a modified tracking algorithm is used to remove the pectoral muscles. The ROI is segmented by a hierarchical fuzzy c-means clustering that incorporates a feature vector of 14 statistical and textural features extracted from the pre-processed image. An adaptive h-domes transformation with a threshold is then used to segment the micro-calcification clusters from the segmented ROI. Fourteen textural and statistical features are extracted from the segmented micro-calcification clusters using the Gray Level Co-occurrence Matrix (GLCM). Of these 14 features, 7 are used for classification with the Naïve Bayes and SVM classifiers.




2. Proposed Algorithm

2.1. Algorithm Steps

The algorithm for computer aided detection of breast cancer can be explained in the

following steps.

Step 1: Read the image.

Step 2: Flip the image if it is left oriented.

Step 3: Remove noise using MROR-ENLM algorithm.

Step 4: Remove pectoral muscles & background objects.

Step 5: Enhance the image using Contrast Limited Adaptive Histogram Equalization.

Step 6: Segment the ROIs that contain micro-calcification clusters using a hierarchical fuzzy c-means clustering with 14 features.

Step 7: Apply the adaptive h-domes transformation, threshold the image and apply morphological operations to identify micro-calcification clusters.

Step 8: Extract 14 features from the micro-calcification clusters.

Step 9: Select 7 relevant features using the information gain attribute evaluator and the ranker search method.

Step 10: Perform the classification using the Naïve Bayes and SVM classifiers.

2.2. Pre-processing

Mammogram images often contain noise, pectoral muscles and unwanted background

objects like name tags and other identification marks. The preprocessing stage handles noise removal and the removal of the pectoral muscle and background objects. For noise removal, a technique named Modified Robust Outlyingness Ratio with Extended Non Local Means filter (MROR-ENLM) is used, and for removing the pectoral muscle and background objects, a new tracking algorithm integrated with connected component labeling is employed. Contrast-Limited Adaptive Histogram Equalization (CLAHE) is used to enhance the contrast of the image. CLAHE operates on small regions within the image, called tiles, instead of the whole image.
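The tile-level operation behind CLAHE can be illustrated with a minimal sketch. It is not the implementation used in this work: the function name `clipped_equalize_tile`, the uniform redistribution of the clipped counts and the 8-level toy tile are assumptions made for illustration, and a full CLAHE additionally interpolates between the mappings of neighbouring tiles, which is omitted here.

```python
def clipped_equalize_tile(tile, clip_limit, levels=256):
    """Contrast-limited histogram equalization of ONE tile (rows of int gray levels)."""
    pixels = [p for row in tile for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Clip every bin at clip_limit and collect the counts that were cut off.
    excess = sum(max(0, h - clip_limit) for h in hist)
    hist = [min(h, clip_limit) for h in hist]
    # Redistribute the excess uniformly over all bins (simplest variant).
    share = excess / levels
    hist = [h + share for h in hist]
    # Map each gray level through the cumulative distribution of the clipped histogram.
    total = float(len(pixels))
    mapping, running = [], 0.0
    for h in hist:
        running += h
        mapping.append(round(running / total * (levels - 1)))
    return [[mapping[p] for p in row] for row in tile]


# Toy 3x4 tile with 8 gray levels, dominated by level 0.
tile = [[0, 0, 0, 0],
        [0, 0, 0, 1],
        [1, 1, 2, 7]]
plain = clipped_equalize_tile(tile, clip_limit=12, levels=8)   # limit never reached
limited = clipped_equalize_tile(tile, clip_limit=3, levels=8)  # dominant bin is clipped
```

On this tile, level 0 is mapped to 4 by plain equalization but only to 2 when the histogram is clipped at 3, which shows how the clip limit restrains the amplification of a dominant gray level.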

2.3. Segmentation

A novel Feature Based Spatial Fuzzy c-means clustering Method (FBSFCM) is implemented to segment the regions of interest for further processing. Fourteen textural and statistical features are extracted from the preprocessed mammogram images using the Gray Level Co-occurrence Matrix (GLCM) at an angle of 0° and a distance of 3 pixels. FBSFCM incorporates both these features and the spatial information of the image to segment the correct regions of interest. The features used for segmentation are contrast, correlation, energy, homogeneity, entropy, dissimilarity, autocorrelation, cluster prominence, cluster shade, sum average, sum entropy, variance, information measure of correlation 1 (correlation1) and difference entropy.

After segmenting the ROI, an adaptive h-domes transformation and a threshold based on the intensity classification of the image are used to segment the micro-calcification clusters in the ROI, based on the local maxima. The algorithm selects all regional maxima in the ROI and is independent of any size or shape criterion. Fig. 1 illustrates the h-domes transformation. A regional maximum M of a gray scale image I is a connected component of pixels with a given value h (a plateau at altitude h), such that every pixel in the neighborhood of M has a strictly lower value [15]. To extract the "domes" of a given height, a gray-level constant h is subtracted from I; the structures recovered in this way are called h-domes. The value of h is not the same for all images: it depends on the average intensity of the image and is therefore higher for brighter images. As a result, the brightness variations in mammograms caused by differences in breast density can be compensated for by the transformation. The h-dome image Dh(I) of a gray scale image I is given by

$$D_h(I) = I - \rho_I(I - h) \tag{1}$$

Here $\rho_I(I - h)$ denotes the morphological grayscale reconstruction of I from the marker image I − h [15]. The transformed image is converted to a binary image by applying Otsu's method [17]. The binary image is then used as a mask on the original gray scale image, so that the resulting image is free of background intensities and contains only the intensity peaks.
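Equation (1) can be illustrated with a small sketch that computes the grayscale reconstruction by repeated geodesic dilation, following the definition in [15]. This is a plain, unoptimized illustration rather than the implementation used in this work: the function names, the 8-connected neighbourhood and the toy image are assumptions, and h is passed in explicitly because the rule that derives it from the average intensity is not specified here.

```python
def reconstruct(marker, mask):
    """Grayscale reconstruction by dilation of `mask` from `marker` (8-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    current = [row[:] for row in marker]
    while True:
        nxt = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # 3x3 dilation of the marker, clipped at the image border ...
                neigh = [current[rr][cc]
                         for rr in range(max(0, r - 1), min(rows, r + 2))
                         for cc in range(max(0, c - 1), min(cols, c + 2))]
                # ... bounded above by the mask (one geodesic dilation step).
                nxt[r][c] = min(max(neigh), mask[r][c])
        if nxt == current:  # stable: reconstruction reached
            return current
        current = nxt


def h_domes(image, h):
    """D_h(I) = I - rho_I(I - h): keeps only the peaks, truncated to height h."""
    marker = [[p - h for p in row] for row in image]
    rec = reconstruct(marker, image)
    return [[p - q for p, q in zip(img_row, rec_row)]
            for img_row, rec_row in zip(image, rec)]


# Flat background of 10 with one peak of height 8 and one of height 3.
image = [[10, 10, 10, 10, 10],
         [10, 18, 10, 13, 10],
         [10, 10, 10, 10, 10]]
domes = h_domes(image, h=5)
```

With h = 5, the background becomes 0, the taller peak is truncated to 5 and the smaller peak keeps its full height of 3, so the result depends on local contrast and not on the absolute brightness of the region.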

Figure 1. H-dome Transformation of Gray Scale Image I
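As an illustration of the GLCM features used in this section, the sketch below builds a co-occurrence matrix at an angle of 0° (horizontal neighbours) and a distance of 3 pixels, and derives five of the fourteen features from it. The function names, the symmetric counting and the exact feature definitions are assumptions: energy is taken as the sum of squared entries and homogeneity uses the squared gray-level difference, whereas some toolboxes define these two slightly differently. The image is assumed to be already quantized to `levels` gray levels and to be wider than `distance`.

```python
import math


def glcm(image, distance=3, levels=4, symmetric=True):
    """Normalized co-occurrence matrix for the offset (0 degrees, `distance` pixels)."""
    counts = [[0.0] * levels for _ in range(levels)]
    for row in image:
        for c in range(len(row) - distance):
            i, j = row[c], row[c + distance]
            counts[i][j] += 1
            if symmetric:  # also count the pair in the opposite direction
                counts[j][i] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in r] for r in counts]


def texture_features(p):
    """A few Haralick-style features of a normalized GLCM `p`."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "energy": sum(v * v for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for i, j, v in cells if v > 0),
    }


# Two-level toy image: every pixel pair 3 columns apart differs by one level.
toy = [[0, 0, 0, 1, 1, 1],
       [0, 0, 0, 1, 1, 1]]
features = texture_features(glcm(toy, distance=3, levels=2))
```

For the toy image all co-occurring pairs are (0, 1) or (1, 0), so contrast and dissimilarity are 1, homogeneity and energy are 0.5 and entropy is 1 bit.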

2.4. Feature Selection

After the micro-calcification clusters have been segmented from the ROI of the pre-processed image, the same fourteen features are extracted from the clusters using GLCM at an angle of 0° and a distance of three pixels. This is followed by feature selection, the process of removing irrelevant features, which reduces the dimensionality of the feature set. In this work, the information gain attribute evaluator and the ranker search method are used to select attributes and rank them based on the filter and wrapper method. The information gain values vary from no information to maximum information. Attributes that contribute more information have a higher information gain value and may be selected, whereas attributes that contribute little information have a lower score and can be removed. This process selects 7 relevant features out of the fourteen features extracted using GLCM for classification. The selected features are contrast, correlation, energy, homogeneity, autocorrelation, dissimilarity and entropy.
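The ranking step can be sketched as follows. This is a minimal illustration of information gain ranking, not the tool used in this work: the function names and the toy table are hypothetical, and the feature values are assumed to be already discretized, since the discretization applied to the continuous GLCM features is not described here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy obtained by splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(table, labels, keep):
    """Return the `keep` feature names with the highest information gain."""
    gains = {name: information_gain(col, labels) for name, col in table.items()}
    return sorted(gains, key=gains.get, reverse=True)[:keep]


# Hypothetical discretized features for four samples.
labels = ["malignant", "malignant", "benign", "benign"]
table = {
    "contrast": [1, 1, 0, 0],     # separates the classes perfectly
    "variance": [0, 1, 0, 1],     # carries no class information
    "entropy_f": [1, 1, 1, 0],    # partially informative
}
selected = rank_features(table, labels, keep=2)
```

Here `contrast` has a gain of 1 bit, `variance` a gain of 0 and `entropy_f` a gain of about 0.31 bits, so the two retained features are `contrast` and `entropy_f`. In the setting of this paper, `keep` would be 7 and the table would hold the fourteen GLCM features.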

2.5. Classification

Classification is the most important step in an automatic breast cancer detection system. Various measurements based on co-occurrence matrix features are given as inputs to the classifier. Here, SVM and Naïve Bayes classifiers are used for classification.

Naïve Bayes

Naïve Bayes is one of the simplest density estimation methods, from which one of the standard classification methods in machine learning can be formed. It works on the basis of Bayes' theorem [18]. Compared with other classification algorithms, Naïve Bayes is well suited to combining multiple prior probabilities from the training set. According to Bayes' theorem, the posterior probability $P(c \mid x)$ is calculated from $P(c)$, $P(x)$ and $P(x \mid c)$. The Naïve Bayes classifier works on the assumption of conditional independence, which says that the effect of the value of a predictor (x) on a given class (c) is independent of the values of the other predictors.

$$P(c \mid x) = \frac{P(x \mid c)\,P(c)}{P(x)}$$

where

$P(c \mid x)$ is the posterior probability,

$P(c)$ is the class prior probability,

$P(x \mid c)$ is the likelihood, which is the probability of the predictor given the class, and

$P(x)$ is the predictor prior probability.
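A minimal sketch of a classifier built on this rule is given below. It assumes a Gaussian likelihood for each feature, which is a common choice for continuous inputs but is not stated in this work; the function names and the two-feature toy data are hypothetical. Because P(x) is the same for every class, the sketch compares log P(c) + Σ log P(x_i | c) and never evaluates P(x).

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for every class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(X)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model


def predict(model, row):
    """Return the class with the largest (unnormalized) log posterior."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        # Conditional independence: the per-feature log likelihoods simply add up.
        for v, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


# Hypothetical two-feature training set.
X = [[1.0, 0.9], [1.2, 1.1], [3.0, 3.2], [3.1, 2.9]]
y = ["benign", "benign", "malignant", "malignant"]
model = fit_gaussian_nb(X, y)
```

With this toy model, `predict(model, [1.1, 1.0])` returns `"benign"` and `predict(model, [3.0, 3.0])` returns `"malignant"`.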

Support Vector Machine (SVM)

The SVM is a supervised learning algorithm that infers a function from a set of labelled examples. The function takes new examples as input and produces predicted labels as output. It is defined on the space from which the examples are taken and takes one of two values at every point in that space, corresponding to the two class labels considered in binary classification. SVM is a classification and regression prediction tool that uses machine learning theory to maximize predictive accuracy while limiting over-fitting of the data. Suppose a set of data points each belongs to one of two classes, and the goal is to decide the class to which a new data point belongs. In SVM, a data point is viewed as a p-dimensional vector (a list of p numbers), and the question is whether such points can be separated by a (p − 1)-dimensional hyperplane. A classifier of this kind is called a linear classifier. Many hyperplanes might separate the data; a reasonable choice for the best hyperplane is the one that gives the largest separation, or margin, between the two classes.
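The idea of a maximum-margin linear classifier can be sketched with a few lines of code. The example trains a linear SVM by subgradient descent on the regularized hinge loss; the solver, the hyperparameters `lam`, `lr` and `epochs`, and the toy data are all assumptions, since the kernel and training procedure used in this work are not specified. Labels are encoded as +1 and −1.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Subgradient descent on the L2-regularized hinge loss; labels must be +1 / -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 penalty shrinks w; a smaller norm corresponds to a wider margin.
            w = [wi - lr * lam * wi for wi in w]
            if margin < 1:  # point lies inside the margin or is misclassified
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b


def classify(w, b, x):
    """Side of the hyperplane w.x + b = 0 on which x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable 2-D points (p = 2, so the hyperplane is a line).
X = [[-2, -2], [-1, -2], [-2, -1], [2, 2], [1, 2], [2, 1]]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
predictions = [classify(w, b, x) for x in X]
```

After training, the learned line separates the two groups of toy points, and a new point such as `[3, 3]` falls on the positive side. For the three classes considered in this paper, a binary SVM of this kind would have to be combined in a one-vs-rest or one-vs-one scheme, which is not shown.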

3. Experiment and Results

For testing, the MIAS database [16] was employed; it contains 322 images of the left and right breasts of 161 patients. The images are of three types: normal, benign and malignant. Figure 2 shows an original mammogram image taken from the MIAS database (a), the image corrupted with 30% noise (b) and the image restored using MROR-ENLM (c). For removing the pectoral muscles, right MLO mammograms are flipped to left MLO, as shown in Figure 3. Figure 4 shows the ground truth marked by an expert radiologist, the enhanced image, the pectoral identified image and the pectoral removed image for mdb007 and mdb008. To segment the ROI that contains micro-calcification clusters, the Feature Based Spatial Fuzzy C-Means clustering (FBSFCM) method is used, which integrates both spatial and feature information with conventional FCM. The result obtained by applying FBSFCM and the corresponding benchmark marked by a radiologist are given in Figure 5. The experiment was performed with 322 images and the resulting accuracy measures are given in Table 1. Table 2 compares the accuracy measures of the two methods, and a graphical representation is given in Figure 6.

Figure 2. (a) Original image mdb058. (b) mdb058 corrupted with 30% noise. (c) Restored image of mdb058 using MROR-ENLM.




Figure 3. (a) Original image mdb007. (b) Flipped image. (c) Enhanced image.

[Figure 4 panels, for images mdb007 and mdb008: ground truth; enhanced image; pectoral identified image; pectoral muscle removed]

Figure 4. The Ground Truth marked by an Expert Radiologist, Enhanced Image, Pectoral Identified Image and Pectoral Removed Image




Figure 5. Benchmark and the result of segmentation using FBSFCM

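The performance of each classifier is measured using the following formulas, where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives:

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{2}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{3}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP} \tag{4}$$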

[Figure 5 panels: labelled image; segmented ROI using FBSFCM; micro-calcification in FBSFCM segmented ROI]

Table 1. Experiment results

| Classifier | No. of images | TP | TN | FP | FN |
|---|---|---|---|---|---|
| SVM | 322 | 42 | 226 | 44 | 10 |
| NB | 322 | 46 | 229 | 39 | 8 |

Table 2. Comparison of accuracy measures of the two methods

| Method | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|
| SVM classifier | 80.76 | 83.70 | 83.22 |
| Naïve Bayes classifier | 85.18 | 85.44 | 85.40 |
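As a worked check, Equations (2) to (4) can be applied directly to the confusion counts reported for the two classifiers. The function name `metrics` is illustrative; the counts are the TP, TN, FP and FN values from the experiment results.

```python
def metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy in percent, per Equations (2)-(4)."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + fn + tn + fp),
    }


svm = metrics(tp=42, tn=226, fp=44, fn=10)
nb = metrics(tp=46, tn=229, fp=39, fn=8)

for name, m in (("SVM", svm), ("NB", nb)):
    print(name, {k: round(v, 2) for k, v in m.items()})
```

The computed values are approximately 80.77, 83.70 and 83.23 for SVM and 85.19, 85.45 and 85.40 for Naïve Bayes, which agree with the tabulated measures up to rounding in the second decimal place.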

Figure 6: Graphical representation of the accuracy comparison

4. Conclusion

The development of an adaptive Computer Aided Detection technique for segmenting mammograms is highly desirable in order to assist radiologists in the interpretation of abnormalities and to improve diagnostic accuracy. Noise removal, pectoral muscle removal and segmentation of micro-calcification clusters play an important role in the detection of abnormalities in digital mammograms. Every suspicious object can be detected using a binary image, which is used as a mask for object extraction from the original image. Feature extraction and selection methods are applied to obtain the relevant features for classification. Two classifiers are used to detect abnormalities in mammograms, and the results show that the Naïve Bayes classifier gave better results than the SVM classifier.

[Figure 6 chart: vertical axis "In Percentage" (78.00 to 86.00); horizontal axis "Quality Measures" (Sensitivity, Specificity, Accuracy); series: SVM and NB]




References

[1] R. J. Ferrari, R. M. Rangayyan, J. E. Desautels, R. A. Borges and A. F. Frère, “Automatic identification of the pectoral muscle in mammograms”, IEEE Trans. on Medical Imaging, vol. 23, no. 2, pp. 232-245.

[2] R. Takiar, D. Nadayil and A. Nandakumar, “Projections of number of cancer cases in India (2010-2020) by cancer groups”, Asian Pac J Cancer Prev, vol. 11, no. 4, (2010).

[3] B. Verma and J. Zakos, “A Computer-Aided Diagnosis System for Digital Mammograms Based on Fuzzy-Neural and Feature Extraction Techniques”, IEEE Transactions on Information Technology in Biomedicine, vol. 5, no. 1, (2001).

[4] D. H. Davies and D. R. Dance, “Automatic computer detection of clustered calcifications in digital mammograms”, Phys. Med. Biol., vol. 35, no. 8, (1990), pp. 1111-1118.

[5] B. N. Beena Ullala Mata and M. Meenakshi, “A Novel Approach for Automatic Detection of Abnormalities in Mammograms”, Recent Advances in Intelligent Computational Systems (RAICS), IEEE, (2011), Print ISBN: 978-1-4244-9478-1.

[6] N. Mudigonda and R. Rangayyan, “Detection of Breast Masses in Mammograms by Density Slicing and Texture Flow-Field Analysis”, IEEE Transactions on Medical Imaging, vol. 20, no. 12, (2001).

[7] M. P. Sampat, M. K. Markey and A. C. Bovik, “Computer-Aided Detection and Diagnosis in Mammography”, Handbook of Image and Video Processing, (2005), pp. 1195-1217.

[8] M. Wirth, “Nonrigid approach to medical image registration: matching images of the breast”, RMIT University, Melbourne, Australia, (2000).

[9] G. Pelin, A. Serbas and G. Pelin, “Mammographic mass classification using wavelet based support vector machine”, Journal of Electrical and Electronics Engineering, vol. 9, no. 1, (2009), pp. 867-875.

[10] S. C. Bose, K. R. Shankar Kumar and M. Karnan, “Detection of Microcalcification in Mammograms using Soft Computing Techniques”, European Journal of Scientific Research, vol. 86, no. 1, pp. 103-122.

[11] J. Deeba, N. Albert Singh and S. Tamil Selvi, “Computer-aided detection of breast cancer on mammograms: A swarm intelligence optimized wavelet neural network approach”, Journal of Biomedical Informatics, vol. 49, (2014), pp. 45-52.




[12] H. Boulehmi, H. Mahersia and K. Hamrouni, “A New CAD System for Breast Microcalcifications Diagnosis”, International Journal of Advanced Computer Science and Applications, vol. 7, no. 4, (2016), pp. 133-143.

[13] N. Alam and R. Zwiggelaar, “Automatic classification of clustered microcalcification in digitized mammogram using ensemble learning”, The Fourteenth International Workshop on Breast Imaging, (2018), Atlanta, Georgia, United States.

[14] B. Singh and M. Kaur, “An approach for classification of malignant and benign microcalcification clusters”, Indian Academy of Sciences, vol. 8, no. 2, (2018), pp. 39-43.

[15] L. Vincent, “Morphological grayscale reconstruction in image analysis: Applications and efficient algorithms”, IEEE Trans. Image Process., vol. 2, (1993), pp. 176-201.

[16] J. Suckling, J. Parker, D. R. Dance, S. Astley, I. Hutt and C. Boggis, “The mammographic image analysis society digital mammogram database”, 2nd International Workshop on Digital Mammography, (1994).

[17] N. Suresh Chandra Satapathy, N. Sri Madhava Raja, V. Rajinikanth, Amira S. Ashour and Nilanjan Dey, “Multi-level image thresholding using Otsu and chaotic bat algorithm”, Neural Computing and Applications, vol. 29, no. 12, (2018), pp. 1285-1307.

[18] V. Priya and N. Sathya, “Classification and Prediction of Dermatitis Dataset Using Naïve Bayes And Value Weighted Naïve Bayes Algorithms”, International Research Journal of Engineering and Technology (IRJET), vol. 06, no. 02, (2019), pp. 1077-1081.


