CHAPTER 5

WBC IMAGE SEGMENTATION AND CLASSIFICATION

USING RELEVANCE VECTOR MACHINE

5.1 INTRODUCTION

Medical image segmentation is a vital step in the detection and diagnosis of diseases. Accurate segmentation of White Blood Cells (WBCs) is particularly important because the differential count plays a major role in determining the disease, and the treatment of the patient is based on it. The Standard Modified Fuzzy Possibilistic C-Means is used for segmentation. A WBC image classification method based on Relevance Vector Machines (RVMs) is given by Alagappan et al (2012). It is proposed to use a Fast RVM based approach for the segmentation of WBC images. The Modified RVM (MRVM) has a much faster testing time than standard RVM based classification, so the MRVM based classification approach is more suitable for applications that require low complexity and, possibly, real-time classification. The proposed methodology of WBC classification using MRVM is shown in Figure 5.1.

Figure 5.1 Proposed Methodology: Blood Cell Images → Segmentation of WBC → Feature Extraction → Modified RVM Classification

The Modified RVM has an identical functional form to the Fast Relevance Vector Machine, but provides probabilistic classification. Firstly, the astonishingly sparse Relevance Vectors (RVs) are obtained while fitting the 1-D histogram by MRVM. Finally, the entire connected WBC regions are segmented from the original image. The method also has advantages such as high computational efficiency and no extra parameter setting.

5.2 WBC DETECTION

The automatic detection of white blood cells (WBCs) remains an unsolved issue in medical imaging. The analysis of WBC images has engaged researchers from the fields of medicine and computer vision alike. This work presents an algorithm for the automatic detection of WBCs embedded in complicated and cluttered smear images that treats the complete process as an image detection problem. The approach, which is based on the proposed algorithm, transforms the detection task into an optimization problem. Although detection algorithms based on optimization approaches present several advantages over traditional approaches, they have scarcely been applied to WBC detection.

5.3 FEATURE EXTRACTION

In image processing, feature extraction is a particular form of dimensionality reduction. When the input data to an algorithm is too large to be processed, it is transformed into a reduced representation, a set of features. Transforming the input data into this set of features is called feature extraction. The cell feature extraction is based on four main groups.

They are:

Textural

Colour

Geometrical

Shape

5.3.1 Texture Feature

Let a and b be any two pixels in the image, separated by the spatial offset $d = (\Delta x, \Delta y)$, and let $p_d(i, j)$ be the probability that the gray levels of a and b are i and j respectively. After all pixels in the image have been traversed, the gray level co-occurrence matrix of the image is obtained. The gray level co-occurrence matrix is thus a texture analysis method based on estimating the second order joint conditional probability density function of the image. Contrast, entropy, angular second moment and deficit moment (also known as the inverse difference moment) are adopted as texture features of the image. The mean value, variance, skewness, and kurtosis are also used as texture features.

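Contrast reflects image clarity and the depth of the texture grooves:

$$\mathrm{CON} = \sum_{i}\sum_{j} (i - j)^2 \, p_d(i, j) \tag{5.1}$$

Entropy reflects the amount of information in the image. If there is no defect in the surface, there is almost no texture information and the entropy is nearly zero:

$$\mathrm{ENT} = -\sum_{i}\sum_{j} p_d(i, j) \log p_d(i, j) \tag{5.2}$$

Angular second moment reflects the uniformity of the gray level distribution and the thickness (coarseness) of the texture:

$$\mathrm{ASM} = \sum_{i}\sum_{j} p_d(i, j)^2 \tag{5.3}$$

Deficit moment reflects the homogeneity of the image texture and measures local texture changes:

$$\mathrm{IDM} = \sum_{i}\sum_{j} \frac{p_d(i, j)}{1 + (i - j)^2} \tag{5.4}$$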

After the gray level co-occurrence matrix of the image has been generated, the above four parameters are calculated to obtain the gray texture feature vector of the image.
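The computation of the four co-occurrence features can be illustrated with a small sketch. It assumes a single pixel offset (dx, dy), an image given as rows of integer gray levels, and the standard textbook definitions of contrast, entropy, angular second moment and inverse difference (deficit) moment. The function names `glcm` and `texture_features` are hypothetical and are not taken from the thesis.

```python
import math

def glcm(image, dx, dy, levels):
    """Normalised gray level co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    # Divide by the number of pixel pairs so that the entries sum to one.
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Contrast, entropy, angular second moment and inverse difference moment."""
    contrast = entropy = asm = idm = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            contrast += (i - j) ** 2 * v
            asm += v * v
            idm += v / (1 + (i - j) ** 2)
            if v > 0:  # 0 * log(0) is taken as 0
                entropy -= v * math.log(v)
    return {"contrast": contrast, "entropy": entropy, "asm": asm, "idm": idm}
```

For a perfectly uniform image the matrix has a single non-zero entry, so contrast and entropy are zero while the angular second moment and the inverse difference moment are one, which matches the remark that a surface without texture has nearly zero entropy.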

5.3.2 Colour Feature

The colour histogram method, the colour aggregation vector method and the colour set method are common colour feature extraction methods, but the dimension of the resulting feature vector and the complexity of the algorithms are high. The colour moment method is therefore adopted to describe the image colour characteristics, because the colour distribution information of an image is concentrated in the low order moments: the first order moment describes the average colour, the second order moment describes the colour variance and the third order moment describes the colour skewness, as in Equations (5.5), (5.6) and (5.7):

$$\mu_i = \frac{1}{N}\sum_{j=1}^{N} P_{ij} \tag{5.5}$$

$$\sigma_i = \left( \frac{1}{N}\sum_{j=1}^{N} (P_{ij} - \mu_i)^2 \right)^{1/2} \tag{5.6}$$

$$s_i = \left( \frac{1}{N}\sum_{j=1}^{N} (P_{ij} - \mu_i)^3 \right)^{1/3} \tag{5.7}$$

where $P_{ij}$ represents the probability of the pixels of gray level j emerging in the i-th colour channel component, and N represents the number of image pixels. In this work the HSI colour space is adopted because it is closer to human colour perception. The image colour information is mainly concentrated in the hue component H and the saturation component S, so the first three colour moments of components H and S are computed, and a 6-D feature vector is obtained.
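A minimal sketch of the colour moment computation follows. It assumes that each channel is supplied as a flat list of per-pixel values and that the third moment is reported as a signed cube root, as in the usual colour moment definition. The names `colour_moments` and `colour_feature_vector` are hypothetical, and the conversion of the image to its H and S components is assumed to have been done beforehand.

```python
import math

def colour_moments(channel):
    """First three colour moments (mean, standard deviation, skewness) of one channel."""
    n = len(channel)
    mean = sum(channel) / n
    second = sum((v - mean) ** 2 for v in channel) / n
    third = sum((v - mean) ** 3 for v in channel) / n
    std = math.sqrt(second)
    # Signed cube root, so that a negative third moment gives a negative skewness.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew

def colour_feature_vector(hue, saturation):
    """6-D feature vector built from the moments of the H and S components."""
    return list(colour_moments(hue)) + list(colour_moments(saturation))
```

Using only two channels and three moments keeps the descriptor at six values, which is the low-dimensional alternative to histogram-based colour features that the text argues for.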

5.3.3 Geometrical Feature

Geometrical features are widely used, as the various blood cells differ greatly in size and nucleus shape. Geometrical features are computed on the basis of the Region of Interest (ROI) R, which has a well defined closed boundary. They include area, perimeter, centroid, tortuosity (area/perimeter), compactness C (given by perimeter²/area) and radius. In this work, a simple step-by-step procedure for the detection and segmentation of blood smear particles is presented using the geometrical features.
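The geometrical descriptors listed above can be sketched for a region whose closed boundary is available as an ordered list of vertices. This is an illustrative assumption: the thesis computes the features on segmented image regions, whereas the hypothetical function `geometric_features` below works on a polygonal boundary using the shoelace formula.

```python
import math

def geometric_features(boundary):
    """Area, perimeter, centroid, tortuosity and compactness of a closed polygon.

    boundary: list of (x, y) vertices in order; the last vertex joins the first.
    """
    n = len(boundary)
    twice_area = 0.0
    perimeter = 0.0
    cx = cy = 0.0
    for k in range(n):
        x0, y0 = boundary[k]
        x1, y1 = boundary[(k + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    centroid = (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
    return {
        "area": area,
        "perimeter": perimeter,
        "centroid": centroid,
        "tortuosity": area / perimeter,          # area / perimeter, as in the text
        "compactness": perimeter ** 2 / area,    # perimeter^2 / area, as in the text
    }
```

Compactness is scale invariant, so it separates round cells from elongated or lobed ones independently of magnification, while area and perimeter capture the size differences between cell types.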

5.3.4 Shape Feature

Shape does not refer to the shape of an image but to the shape of a particular region that is being sought out. Shapes are often determined by first applying segmentation or edge detection to an image. Other methods use shape filters to identify given shapes in an image. In some cases accurate shape detection requires human intervention, because methods like segmentation are very difficult to automate completely.

5.4 SUPPORT VECTOR MACHINE (SVM)

Support Vector Machines are a set of supervised learning methods used for classification and regression.

The advantages of support vector machines are:

Effective in high dimensional spaces.

Still effective in cases where the number of dimensions is greater than the number of samples.

Memory efficient: uses a subset of training points in the decision function (called support vectors).

Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels.

The disadvantages of support vector machines include:

If the number of features is much greater than the number of samples, the method is likely to give poor performance.

SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation.

5.4.1 SVM Application

SVMs are currently among the best performers for a number of classification tasks ranging from text to genomic data. SVMs can be applied to complex data types beyond feature vectors (e.g. graphs, sequences, and relational data) by designing kernel functions for such data. SVM techniques have been extended to a number of tasks such as regression, principal component analysis, etc. The most popular optimization algorithms for SVMs use decomposition to hill-climb over a subset of the $\alpha_i$ at a time, e.g. SMO. Tuning SVMs remains a black art: selecting a specific kernel and parameters is usually done in a try-and-see manner.

Today, research on image surface defects is focused on defect detection, and hardly on defect classification once the defect area has been determined correctly. The thesis focuses on surface defect classification in order to meet the requirements of grading standards, and thereby achieves automatic image classification.

5.5 RELEVANCE VECTOR MACHINE (RVM)

Given a training data set of input-target pairs $D = \{(x_i, t_i)\}_{i=1}^{l}$, RVM follows the standard probabilistic formulation and assumes that the targets are samples from the model with additive noise:

$$t_i = y(x_i; w) + \varepsilon_i \tag{5.8}$$

where $\varepsilon_i$ are error terms which are generally assumed to be independent identically distributed Gaussian variables with zero mean and variance $\sigma^2$. The likelihood function can be written as:

$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-l/2} \exp\left\{ -\frac{1}{2\sigma^2} \lVert t - \Phi w \rVert^2 \right\} \tag{5.9}$$

where $\Phi = [\phi(x_1), \ldots, \phi(x_l)]^T$ is the design matrix, in which $\phi(x_i) = [1, k(x_i, x_1), \ldots, k(x_i, x_l)]^T$ and k is a kernel.

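Maximum likelihood estimation of w and $\sigma^2$ from (5.9) will generally lead to severe over-fitting, so RVM encodes a preference for smoother functions by defining an automatic relevance determination Gaussian prior over the weights:

$$p(w \mid \alpha) = \prod_{i=0}^{l} \mathcal{N}(w_i \mid 0, \alpha_i^{-1}) \tag{5.10}$$

The posterior over the weights is then obtained from Bayes' rule:

$$p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = \mathcal{N}(w \mid \mu, \Sigma) \tag{5.11}$$

where $\Sigma = (\sigma^{-2}\Phi^T\Phi + A)^{-1}$, $\mu = \sigma^{-2}\Sigma\Phi^T t$ and $A = \mathrm{diag}(\alpha_0, \ldots, \alpha_l)$.

By integrating out the weights from the product $p(t \mid w, \sigma^2)\, p(w \mid \alpha)$, RVM obtains the marginal likelihood for the hyper-parameters:

$$p(t \mid \alpha, \sigma^2) = (2\pi)^{-l/2} \left| \sigma^2 I + \Phi A^{-1} \Phi^T \right|^{-1/2} \exp\left\{ -\tfrac{1}{2}\, t^T (\sigma^2 I + \Phi A^{-1} \Phi^T)^{-1} t \right\} \tag{5.12}$$

Because the values of $\alpha$ and $\sigma^2$ that maximize the function defined in equation (5.12) cannot be obtained in closed form, RVM considers an alternative formula for iterative re-estimation of $\alpha$ and $\sigma^2$:

$$\alpha_i^{new} = \frac{\gamma_i}{\mu_i^2}, \qquad (\sigma^2)^{new} = \frac{\lVert t - \Phi\mu \rVert^2}{l - \sum_i \gamma_i}, \qquad \gamma_i = 1 - \alpha_i \Sigma_{ii} \tag{5.13}$$

During re-estimation, many of the $\alpha_i$ approach infinity, and the corresponding weights approach zero, implying that the corresponding basis functions can be pruned. The remaining $\alpha_i$ finally stabilize at some finite numbers. The corresponding $x_i$ are called relevance vectors. It has been demonstrated that the solution obtained by RVM is astonishingly sparse, which is helpful for improving the efficiency of the algorithm. That is one of the reasons why RVM rather than SVM is chosen as the fitting tool in this work. Another important reason is that RVM needs no extra parameter setting, which makes the method more convenient. In contrast, for SVM, it is necessary to estimate the error/margin trade-off parameter C, which generally entails a cross-validation procedure that is wasteful both of data and computation.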

In mathematics, a Relevance Vector Machine (RVM) is a machine learning technique that uses Bayesian inference to obtain parsimonious solutions for regression and probabilistic classification. The RVM has an identical functional form to the support vector machine, but provides probabilistic classification.

It is actually equivalent to a Gaussian process model with covariance function:

$$k(x, x') = \sum_{j=1}^{N} \frac{1}{\alpha_j}\, \varphi(x, x_j)\, \varphi(x', x_j) \tag{5.14}$$

where $\varphi$ is the kernel function (usually Gaussian), $\alpha_j$ are the hyper-parameters of the weight prior, and $x_1, \ldots, x_N$ are the input vectors of the training set.

Compared to that of support vector machines (SVM), the Bayesian

formulation of the RVM avoids the set of free parameters of the SVM (that

usually require cross-validation-based post-optimizations). However, RVMs

use an Expectation Maximization (EM)-like learning method and are

therefore at risk of local minima. This is unlike the standard sequential

minimal optimization (SMO)-based algorithms employed by SVMs, which

are guaranteed to find a global optimum (of the convex problem).

However, despite the success of the SVM, a number of significant and practical disadvantages of the support vector learning methodology can be identified:

Although relatively sparse, SVMs make unnecessarily liberal use of basis functions, since the number of support vectors required typically grows linearly with the size of the training set. Some form of post-processing is often required to reduce computational complexity.

Predictions are not probabilistic. In regression the SVM outputs a point estimate, and in classification, a 'hard' binary decision. Ideally, one desires to estimate the conditional distribution p(t|x) in order to capture uncertainty in the prediction. In regression this may take the form of 'error-bars', but it is particularly crucial in classification, where posterior probabilities of class membership are necessary to adapt to varying class priors and asymmetric misclassification costs. Posterior probability estimates have been coerced from SVMs via post-processing, although it is argued that these estimates are unreliable.

It is necessary to estimate the error/margin trade-off parameter 'C' (and in regression, the insensitivity parameter 'ε' too). This generally entails a cross-validation procedure, which is wasteful both of data and computation.

The kernel function K(x, x_i) must satisfy Mercer's condition. That is, it must be the continuous symmetric kernel of a positive integral operator.

The 'relevance vector machine' (RVM) is a Bayesian treatment which does not suffer from any of the above limitations. Specifically, it adopts a fully probabilistic framework and introduces a prior over the model weights governed by a set of hyper-parameters, one associated with each weight, whose most probable values are iteratively estimated from the data. Sparsity is achieved because, in practice, the posterior distributions of many of the weights are sharply (indeed infinitely) peaked around zero. The training vectors associated with the remaining non-zero weights are termed 'relevance' vectors, in deference to the principle of automatic relevance determination which motivates the approach. The most compelling feature of the RVM is that, while capable of generalization performance comparable to an equivalent SVM, it typically utilizes dramatically fewer kernel functions.

5.5.1 RVM Theory

The methods and functions used in this RVM theory are discussed by Tzikas et al (2006).

A) Multi-kernel Relevance Vector Machine

The relevance vector machine (RVM) is a special case of a sparse linear model, where the basis functions are formed by a kernel function $\phi$ centred at the different training points:

$$y(x) = \sum_{i=1}^{N} w_i\, \phi(x - x_i) \tag{5.15}$$

While this model is similar in form to the support vector machine (SVM), the kernel function here does not need to satisfy Mercer's condition.

Multi-kernel RVM is an extension of the simple RVM model. It consists of several different types of kernels $\phi_m$:

$$y(x) = \sum_{m=1}^{M} \sum_{i=1}^{N} w_{mi}\, \phi_m(x - x_i) \tag{5.16}$$

The sparseness property enables automatic selection of the proper kernel at each location by pruning all irrelevant kernels, though it is possible that two different kernels remain at the same location.

B) Sparse Bayesian Prior

A sparse weight prior distribution can be obtained by modifying the commonly used Gaussian prior, such that a different variance parameter is assigned to each weight:

$$p(w \mid \alpha) = \prod_{i=1}^{M} \mathcal{N}(w_i \mid 0, \alpha_i^{-1}) \tag{5.17}$$

where $\alpha = (\alpha_1, \ldots, \alpha_M)$ is a vector consisting of M hyper-parameters, which are treated as independent random variables. A Gamma prior distribution is assigned to these hyper-parameters:

$$p(\alpha_i) = \mathrm{Gamma}(a, b) \tag{5.18}$$

where a and b are constants that are usually set to zero, which results in a flat Gamma distribution. By integrating over the hyper-parameters, the true weight prior $p(w) = \int p(w \mid \alpha)\, p(\alpha)\, d\alpha$ is obtained. This integral gives a Student-t prior, which is known to enforce sparse representations, owing to the fact that its mass is mostly concentrated near the origin and the axes of definition.

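C) Bayesian Inference

Assuming independent, zero-mean Gaussian noise with variance $\beta^{-1}$, i.e., $\varepsilon \sim \mathcal{N}(0, \beta^{-1} I)$, the likelihood of the observed data is:

$$p(t \mid w, \beta) = (2\pi)^{-N/2}\, \beta^{N/2} \exp\left\{ -\frac{\beta}{2} \lVert t - \Phi w \rVert^2 \right\} \tag{5.19}$$

where $\Phi$ is the design matrix, of size N×N or N×(N·M) for the single kernel and multi kernel cases respectively. This matrix is formed by all the basis functions evaluated at all the training points, i.e., $\Phi = [\phi(x_1), \ldots, \phi(x_N)]^T$ where

$$\phi(x_i) = [\phi_1(x_i - x_1), \ldots, \phi_1(x_i - x_N), \ldots, \phi_M(x_i - x_1), \ldots, \phi_M(x_i - x_N)]^T \tag{5.20}$$

In order to make predictions using the Bayesian model, the parameter posterior distribution $p(w, \alpha, \beta \mid t)$ needs to be computed. Unfortunately, it cannot be computed analytically due to its complexity, and approximations have to be made. The parameter posterior is decomposed as:

$$p(w, \alpha, \beta \mid t) = p(w \mid t, \alpha, \beta)\, p(\alpha, \beta \mid t)$$

where the posterior of the weights is Gaussian, $p(w \mid t, \alpha, \beta) = \mathcal{N}(w \mid \mu, \Sigma)$, with

$$\mu = \beta\, \Sigma\, \Phi^T t \tag{5.21}$$

$$\Sigma = (\beta\, \Phi^T \Phi + A)^{-1} \tag{5.22}$$

and $A = \mathrm{diag}(\alpha_1, \ldots, \alpha_M)$.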

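The posterior of the hyper-parameters $p(\alpha, \beta \mid t)$ cannot be computed analytically and is approximated by a delta function at its mode:

$$p(\alpha, \beta \mid t) \approx \delta(\alpha_{MP}, \beta_{MP}) \tag{5.23}$$

The values $\alpha_{MP}$ and $\beta_{MP}$ can be found by maximizing the hyper-parameter posterior, written as:

$$(\alpha_{MP}, \beta_{MP}) = \arg\max_{\alpha, \beta}\; p(\alpha, \beta \mid t) \tag{5.24}$$

and

$$p(\alpha, \beta \mid t) \propto p(t \mid \alpha, \beta)\, p(\alpha)\, p(\beta) \tag{5.25}$$

The term $p(t \mid \alpha, \beta)$ is known as the marginal (type-II) likelihood and is computed by marginalizing the weights:

$$p(t \mid \alpha, \beta) = \int p(t \mid w, \beta)\, p(w \mid \alpha)\, dw,$$

which yields

$$p(t \mid \alpha, \beta) = \mathcal{N}(t \mid 0, C) \tag{5.26}$$

$$C = \beta^{-1} I + \Phi A^{-1} \Phi^T \tag{5.27}$$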

An alternative approach is to follow the variational Bayesian methodology to obtain an approximation to the posterior parameter distribution $p(w, \alpha, \beta \mid t)$. This has been demonstrated, but it was concluded that the method achieves only slightly improved results at the cost of significant additional computation.

D) Marginal Likelihood Optimization for Fast RVM

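The values of $\alpha_{MP}$ and $\beta_{MP}$ cannot be solved for analytically and an iterative method has to be used. Instead of maximizing the hyper-parameter posterior, it is equivalent, and more convenient, to minimize its negative logarithm, which for the multi kernel case is:

$$L = \tfrac{1}{2} \left[ N \log 2\pi + \log|C| + t^T C^{-1} t \right] \tag{5.28}$$

where $C = \beta^{-1} I + \Phi A^{-1} \Phi^T$. Setting M = 1 in this equation gives the single kernel case.

Setting the derivative of L with respect to $\alpha_{mi}$ to zero gives:

$$\alpha_{mi} = \frac{\gamma_{mi}}{\mu_{mi}^2}, \qquad \gamma_{mi} = 1 - \alpha_{mi}\, \Sigma_{(mi)(mi)} \tag{5.29}$$

where $\mu_{mi}$ is the mi-th element of the posterior weight mean and $\Sigma_{(mi)(mi)}$ is the mi-th diagonal element of the posterior weight covariance. At each iteration, $\mu_{mi}$ and $\Sigma_{(mi)(mi)}$ are recomputed from the current hyper-parameter estimates, and the updates are repeated until convergence to $\alpha_{MP}$. Similarly, the following formula can be obtained for the variance parameter:

$$\beta^{-1} = \frac{\lVert t - \Phi\mu \rVert^2}{N - \sum_{m,i} \gamma_{mi}} \tag{5.30}$$

Computing $\Sigma$ requires O((NM)³) computations, which can be very demanding for models with many basis functions. During the training process, basis functions whose corresponding weights are estimated to be zero are pruned, so $\Sigma$ becomes smaller and its inversion becomes easier. However, there are M basis functions initially, so the first iterations remain expensive. It should also be noted that the iterative updates for the hyper-parameters can be derived using an expectation-maximization (EM) algorithm by treating the weights w as hidden variables and the observations t as observed variables.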

E) Incremental Optimization

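A more efficient approach is the incremental algorithm, which is used in this work. The model is initially assumed to contain only one basis function, and basis functions are subsequently added or deleted incrementally. For the case of a flat prior on the hyper-parameters $\alpha$, maximization of the marginal likelihood is equivalent to maximizing:

$$L(\alpha) = \log p(t \mid \alpha, \beta) = -\tfrac{1}{2} \left[ N \log 2\pi + \log|C| + t^T C^{-1} t \right] \tag{5.31}$$

Given a single hyper-parameter $\alpha_i$, $L(\alpha)$ can be decomposed as:

$$L(\alpha) = L(\alpha_{-i}) + \ell(\alpha_i) \tag{5.32}$$

where $L(\alpha_{-i})$ does not depend on $\alpha_i$ and

$$\ell(\alpha_i) = \tfrac{1}{2} \left[ \log \alpha_i - \log(\alpha_i + s_i) + \frac{q_i^2}{\alpha_i + s_i} \right] \tag{5.33}$$

with $s_i = \phi_i^T C_{-i}^{-1} \phi_i$ and $q_i = \phi_i^T C_{-i}^{-1} t$, while $C_{-i}$ is the matrix C with the contribution of basis function i removed, i.e., $C_{-i} = C - \alpha_i^{-1} \phi_i \phi_i^T$. Analysis of $\ell(\alpha_i)$ shows that $L(\alpha)$ has a unique maximum with respect to $\alpha_i$:

$$\alpha_i = \begin{cases} \dfrac{s_i^2}{q_i^2 - s_i}, & q_i^2 > s_i \\ \infty, & q_i^2 \le s_i \end{cases} \tag{5.34}$$

Thus, $\alpha_{MP}$ can be found iteratively by:

adding a basis function $\phi_i$ with $q_i^2 > s_i$,

re-estimating $\alpha_i$ for a basis function already in the model, or

deleting a basis function $\phi_i$ with $q_i^2 \le s_i$.

When adding a basis function or re-estimating the value of its hyper-parameter, $\alpha_i$ is set to

$$\alpha_i = \frac{s_i^2}{q_i^2 - s_i} \tag{5.35}$$

which maximizes $L(\alpha)$.

The vectors s and q are calculated using an iterative algorithm that utilizes their values from the previous iteration; the details of these calculations can be found in Tzikas et al (2006).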
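The decision taken for a single basis function in the incremental scheme can be sketched as follows. The sketch assumes the usual sparse Bayesian criterion, in which a basis function is kept when q² exceeds s and its hyper-parameter is then set to s²/(q² − s), and is otherwise given an infinite hyper-parameter. The function names `update_alpha` and `choose_action` are hypothetical, and the computation of s and q themselves is not shown.

```python
def update_alpha(s_i, q_i):
    """Hyper-parameter value that maximises the marginal likelihood for one basis function."""
    if q_i * q_i > s_i:
        return s_i * s_i / (q_i * q_i - s_i)
    return float("inf")  # the basis function is irrelevant and should be excluded

def choose_action(in_model, s_i, q_i):
    """Decide whether to add, re-estimate, delete or skip one basis function."""
    relevant = q_i * q_i > s_i
    if relevant and not in_model:
        return "add"
    if relevant and in_model:
        return "re-estimate"
    if in_model:
        return "delete"
    return "skip"
```

Because each decision depends only on the two scalars s and q for that basis function, one step of the incremental algorithm avoids inverting the full covariance matrix.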

This incremental algorithm successfully overcomes the major computational drawback of the non-incremental algorithm. However, since at each iteration only one basis function can be modified, significantly more iterations are required to reach convergence. Convergence could be made faster by choosing at each step to modify the basis function that leads to the largest increase of the marginal likelihood. However, this requires evaluating the marginal likelihood increase for all the basis functions at each step and is computationally expensive. Overall, the incremental algorithm is a major improvement over the initial non-incremental algorithm. However, it is still computationally demanding for very large datasets.

5.5.2 Application for RVM

The RVM process is an iterative one and involves repeatedly re-estimating $\alpha$ and $\beta$ until a stopping condition is met.

Algorithm for RVM

The steps are as follows:

1. Select a suitable kernel function for the data set and relevant parameters. Use this kernel function to create the design matrix $\Phi$.

2. Establish suitable convergence criteria for $\alpha$ and $\beta$, e.g. a threshold value $\delta_{Thresh}$ for the change between one iteration's estimation of $\alpha$ and the next, $\delta = \sum_{i} |\alpha_i^{n+1} - \alpha_i^{n}|$, so that re-estimation will stop when $\delta < \delta_{Thresh}$.

3. Establish a threshold value $\alpha_{Thresh}$ such that an $\alpha_i$ is assumed to be tending to infinity upon reaching it.

4. Choose starting values for $\alpha$ and $\beta$.

5. Calculate $m = \beta \Sigma \Phi^T t$ and $\Sigma = (A + \beta \Phi^T \Phi)^{-1}$.

6. Update $\alpha_i = \gamma_i / m_i^2$ and $\beta = (N - \sum_i \gamma_i) / \lVert t - \Phi m \rVert^2$, where $\gamma_i = 1 - \alpha_i \Sigma_{ii}$. (5.36)

7. Prune the $\alpha_i$ and corresponding basis functions where $\alpha_i > \alpha_{Thresh}$.

8. Repeat steps (5) to (7) until the convergence criteria are met.

The hyper-parameter values $\alpha$ and $\beta$ which result from the above procedure are those that maximize the marginal likelihood, and hence they are used when making a new estimate of a target value $t_*$ for a new input $x_*$:

$$t_* = m^T \phi(x_*) \tag{5.37}$$

The variance relating to the confidence in this estimate is given by:

$$\sigma^2(x_*) = \beta^{-1} + \phi(x_*)^T\, \Sigma\, \phi(x_*) \tag{5.38}$$
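The eight steps and the prediction formulas can be put together in a small, dependency-free sketch. It is an illustration under stated assumptions rather than the implementation used in this work: inputs are scalars, the kernel is Gaussian with a hypothetical `width` parameter, no bias column is included, the update in step 6 uses the standard re-estimation formulas α_i = γ_i / m_i² and β = (N − Σγ_i) / ‖t − Φm‖², and the numerical guards (clamping and floors) are additions for robustness. All function names are hypothetical.

```python
import math

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def kernel(x, z, width):
    """Gaussian kernel on scalar inputs (the kernel choice is an assumption)."""
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))

def posterior(phi, t, keep, alpha, beta):
    """Step 5: Sigma = (A + beta Phi^T Phi)^-1 and m = beta Sigma Phi^T t."""
    n, k = len(t), len(keep)
    cols = [[phi[r][j] for j in keep] for r in range(n)]
    prec = [[beta * sum(cols[r][a] * cols[r][b] for r in range(n))
             + (alpha[a] if a == b else 0.0) for b in range(k)] for a in range(k)]
    sigma = invert(prec)
    phi_t = [sum(cols[r][a] * t[r] for r in range(n)) for a in range(k)]
    mean = [beta * sum(sigma[a][b] * phi_t[b] for b in range(k)) for a in range(k)]
    return cols, mean, sigma

def rvm_fit(x, t, width=1.0, alpha_thresh=1e6, delta_thresh=1e-6, max_iter=500):
    n = len(x)
    phi = [[kernel(xi, xj, width) for xj in x] for xi in x]   # step 1: design matrix
    keep = list(range(n))                                     # surviving basis functions
    alpha, beta = [1.0] * n, 1.0                              # step 4: starting values
    for _ in range(max_iter):
        cols, mean, sigma = posterior(phi, t, keep, alpha, beta)
        k = len(keep)
        # Step 6: re-estimate alpha and beta (guards are assumptions of this sketch).
        gamma = [min(1.0, max(0.0, 1.0 - alpha[a] * sigma[a][a])) for a in range(k)]
        new_alpha = [max(gamma[a] / max(mean[a] * mean[a], 1e-300), 1e-12)
                     for a in range(k)]
        fit = [sum(cols[r][a] * mean[a] for a in range(k)) for r in range(n)]
        resid = sum((t[r] - fit[r]) * (t[r] - fit[r]) for r in range(n))
        new_beta = min((n - sum(gamma)) / max(resid, 1e-12), 1e6)
        delta = sum(abs(u - v) for u, v in zip(new_alpha, alpha))
        stay = [a for a in range(k) if new_alpha[a] <= alpha_thresh]   # step 7: prune
        if delta < delta_thresh and len(stay) == k:
            break                                             # step 8: converged
        keep = [keep[a] for a in stay]
        alpha = [new_alpha[a] for a in stay]
        beta = new_beta
    _, mean, sigma = posterior(phi, t, keep, alpha, beta)
    return {"rv": [x[j] for j in keep], "mean": mean, "sigma": sigma,
            "alpha": alpha, "beta": beta, "width": width}

def rvm_predict(model, x_new):
    """Predictive mean (5.37) and variance (5.38) for a new scalar input."""
    basis = [kernel(x_new, xr, model["width"]) for xr in model["rv"]]
    k = len(basis)
    mean = sum(m * b for m, b in zip(model["mean"], basis))
    var = 1.0 / model["beta"] + sum(basis[a] * model["sigma"][a][b] * basis[b]
                                    for a in range(k) for b in range(k))
    return mean, var
```

The entry `rv` of the returned dictionary holds the inputs that survive pruning, that is, the relevance vectors. For histogram fitting, x would be the gray levels and t the histogram counts.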

The algorithm is summarized below:

Step 1: Form a compact histogram from a given microscopic image;

Step 2: Use the RVM to approximate the above compact histogram and obtain all the RVs;

Step 3: Seek the threshold within the RV set such that it occupies the deepest concavity of the histogram;

Step 4: Use the threshold so obtained to segment the given image;

Step 5: Perform morphological operations on the segmented image in order to obtain the entire connected WBC region.

Given a training data set of input-target pairs $D = \{(x_i, t_i)\}_{i=1}^{l}$, RVM follows the standard probabilistic formulation and assumes that the targets are samples from the model with additive noise:

$$t_i = y(x_i; w) + \varepsilon_i \tag{5.39}$$

where $\varepsilon_i$ are error terms which are generally assumed to be independent identically distributed Gaussian variables with zero mean and variance $\sigma^2$.

5.5.3 Decision Function in RVM

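This part first describes the RVM algorithm and the way to apply a multiple kernel strategy. The relevance vector machine is a probabilistic sparse kernel model. The aim is to reveal the underlying distribution of a set of data $\{x_i, y_i\}_{i=1 \ldots n}$, where $x_i \in R^d$. The distribution p(y|x) is modelled as Gaussian, with the standard deviation $\sigma$ coming from the addition of a Gaussian noise $\varepsilon \sim N(0, \sigma^2)$:

$$y_i = \sum_{j=1}^{n} w_j K(x_i, x_j) + w_0 + \varepsilon_i \tag{5.40}$$

where $w_j$ is the coefficient associated to each support vector. So the probability of the data can be rewritten according to the parameters:

$$p(y \mid w, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \lVert y - \Phi w \rVert^2 \right\} \tag{5.41}$$

with $\Phi$ an n × (n + 1) matrix containing the kernel and a bias, whose i-th row is $[1, K(x_i, x_1), \ldots, K(x_i, x_n)]$.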

The key of this approach is to define a prior on each coefficient $w_i$. According to the Automatic Relevance Determination mechanism, all coefficients which are unnecessary are pruned. This mechanism explains the sparsity of the solution, since it prunes all parameters that add complexity to the probabilistic model. By pruning coefficients, the likelihood is then maximized with respect to the input data. The adaptation of this algorithm to the classification case is presented next.

Similar to regression, RVM has also been used for classification. Consider a two-class problem with training points $X = \{x_1, \ldots, x_N\}$ and corresponding class labels $t = \{t_1, \ldots, t_N\}$ with $t_i \in \{0, 1\}$. Based on the Bernoulli distribution, the likelihood (the target conditional distribution) is expressed as:

$$p(t \mid w) = \prod_{i=1}^{N} \sigma\{y(x_i)\}^{t_i} \left[ 1 - \sigma\{y(x_i)\} \right]^{1 - t_i} \tag{5.42}$$

where $\sigma(y)$ is the logistic sigmoid function:

$$\sigma(y) = \frac{1}{1 + \exp(-y)} \tag{5.43}$$

Unlike the regression case, however, the marginal likelihood $p(t \mid \alpha)$ can no longer be obtained analytically by integrating the weights from (5.42), and an iterative procedure has to be used.

Let $\alpha_i^*$ denote the maximum a posteriori (MAP) estimate of the hyper-parameter $\alpha_i$. The MAP estimate for the weights, denoted by $w_{MAP}$, can be obtained by maximizing the posterior distribution of the class labels given the input vectors.

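This is equivalent to maximizing the following objective function:

$$J(w_1, \ldots, w_N) = \sum_{i=1}^{N} \log p(t_i \mid w) + \sum_{i=1}^{N} \log p(w_i \mid \alpha_i^*) \tag{5.44}$$

where the first summation term corresponds to the likelihood of the class labels, and the second term corresponds to the prior on the parameters $w_i$. In the resulting solution, only those samples associated with nonzero coefficients $w_i$ (called relevance vectors) will contribute to the decision function.

The gradient of the objective function J with respect to w is:

$$\nabla J = -A^* w - \Phi^T (f - t) \tag{5.45}$$

where $f = [\sigma(y(x_1)), \ldots, \sigma(y(x_N))]^T$, $A^* = \mathrm{diag}(\alpha_1^*, \ldots, \alpha_N^*)$ and the matrix $\Phi$ has elements $\Phi_{i,j} = K(x_i, x_j)$.

The Hessian of J is:

$$H = \nabla^2 J = -(\Phi^T B \Phi + A^*) \tag{5.46}$$

where $B = \mathrm{diag}(\beta_1, \ldots, \beta_N)$ is a diagonal matrix with $\beta_i = \sigma(y(x_i))\,[1 - \sigma(y(x_i))]$.

The posterior is approximated around $w_{MAP}$ by a Gaussian approximation with covariance

$$\Sigma = -\left( H \big|_{w_{MAP}} \right)^{-1} = (\Phi^T B \Phi + A^*)^{-1} \tag{5.47}$$

and mean

$$\mu = \Sigma\, \Phi^T B\, \hat{t}, \qquad \hat{t} = \Phi w_{MAP} + B^{-1}(t - f) \tag{5.48}$$

These results are identical in form to the regression case, and the hyper-parameters $\alpha_i$ are updated iteratively in the same manner as for the regression case.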
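The quantities used in the classification case can be checked numerically with a short sketch. It assumes labels in {0, 1}, a kernel matrix supplied as a list of rows, and the gradient written as Φᵀ(t − f) − A w, which is the derivative of the Bernoulli log-likelihood plus a zero-mean Gaussian log-prior. The function names are hypothetical.

```python
import math

def sigmoid(y):
    """Logistic sigmoid, written so that large negative inputs do not overflow."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)

def log_likelihood(t, y):
    """Logarithm of the Bernoulli likelihood for labels t and model outputs y."""
    return sum(ti * math.log(sigmoid(yi)) + (1 - ti) * math.log(1.0 - sigmoid(yi))
               for ti, yi in zip(t, y))

def gradient(phi, w, t, alpha):
    """Gradient of the MAP objective: Phi^T (t - f) - A w."""
    n, k = len(t), len(w)
    y = [sum(phi[i][j] * w[j] for j in range(k)) for i in range(n)]
    f = [sigmoid(v) for v in y]
    return [sum(phi[i][j] * (t[i] - f[i]) for i in range(n)) - alpha[j] * w[j]
            for j in range(k)]
```

Setting this gradient to zero gives the MAP weights. In practice this is done iteratively, for example with Newton steps that use the Hessian.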

The Relevance Vector Machine (RVM) technique has been applied in many different areas of pattern recognition, including communication channel equalization, head model retrieval, feature optimization, functional neuroimage analysis and facial expression recognition. In this thesis, two applications are discussed: the first concerns the application of large scale multi kernel RVM for object detection in large scale images, while the second deals with computer-aided diagnosis of micro calcifications in digitized mammograms.

Computational techniques are now routinely applied in the field of turbomachinery. In the improvement of turbomachinery blades, a variety of dedicated computer codes are required to assess and modify a design before the prototype is made. Consequently, the availability of good analysis and design codes is vital for manufacturers to stay at the forefront of a very competitive field of engineering. Inverse methodologies and practical automatic optimization procedures provide a systematic means of design, reducing the considerable time and cost frequently incurred in the conventional technique of iterating between analyzing the design and modifying the blade shape manually. Many 2-D inverse design methods are available, such as those in references, and are commonly used in the preliminary stage of the design process for axial turbomachines. In recent years, 3-D inverse design methods have emerged and have been applied successfully to a wide assortment of designs, involving both turbomachinery blades and wings.

5.6 CLASSIFICATION OF WHITE BLOOD CELLS

To classify a WBC into its respective subtype, features that describe the characteristics of the cytoplasm and the nucleus are used. This work chooses a set of features such as area, equivalent diameter, circularity, perimeter, convex area, solidity, orientation, eccentricity, ratio of the areas of nucleus and cytoplasm, and majoraxislength/minoraxislength, separately evaluated for the nucleus and the WBC. The result obtained from the previous step gives information about the broad nucleus type (segmented or nonsegmented). This result is a novel binary feature added to the classifier. In addition, features like the "circularity" of the nucleus (ratio between the perimeter of the tightest bounding circle and the nuclear perimeter), nucleus to cytoplasm ratio, ratio of nucleus area to area of WBC, entropy of the cytoplasm, and mean gray-level intensity of the cytoplasm (all three colour channels) are computed. A linear discriminant is used to reduce the multidimensional dataset to six dimensions. A linear discriminant in this six-dimensional space is then used to classify the data into their respective types.

Linear Discriminant Analysis (LDA) is used to find a linear

combination of the features which characterizes or separates these five classes

of WBCs. The classifier is biased using the number of samples in each class.

The system is evaluated using 10-fold cross-validation. Cross-validation is a

technique for assessing how the results of a statistical analysis will generalize

to an independent dataset. It is mainly used in settings where the goal is

prediction, and one wants to estimate how accurately a predictive model will

perform in practice. One round of cross-validation involves partitioning a

sample of data into complementary subsets, performing the analysis on one

subset (called the training set), and validating the analysis on the other subset

(called the validation set or testing set). To reduce variability, multiple rounds

of cross-validation are performed using different partitions, and the validation

results are averaged over the rounds. The functions from the Statistics

Toolbox in MATLAB have been used to analyze the data.
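The partitioning used in 10-fold cross-validation can be sketched as follows. The thesis relies on the MATLAB Statistics Toolbox for this step; the Python functions below, `k_fold_indices` and `cross_validate`, are hypothetical stand-ins that only illustrate the idea, and the interleaved assignment of samples to folds is an assumption.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    # Sample i goes to fold i mod k, so fold sizes differ by at most one.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield training, validation

def cross_validate(n_samples, evaluate, k=10):
    """Average the score returned by evaluate(training, validation) over k rounds."""
    scores = [evaluate(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

Here `evaluate` would train the classifier on the training indices and return its accuracy on the validation indices. Every sample is used for validation exactly once across the ten rounds.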

5.7 IMAGE ANALYSIS

Automatic recognition of white blood cells in light microscopic images usually consists of four major steps: pre-processing, image segmentation, feature extraction, and classification, as shown by Rezatofighi et al (2009). The pre-processing stage usually includes enhancement of the acquired image and is essentially performed in order to prepare the image for the vital segmentation stage. Individual objects of interest are separated from the background in the segmentation process. This is followed by a labelling operation (post-processing) in which segmented objects of interest are tagged with unique labels that can be used to count the number of objects in the image. These labels, along with the spatial information of the segmented objects, are used for the subsequent feature extraction procedure. The geometrical features are used to identify and classify the leukocyte cells, namely, lymphocyte, monocyte, and neutrophil. The proposed method for the segmentation and classification of blood cells (leukocytes) is given below.

Algorithm 1: Training phase

Step 1: Input the leukocyte colour cell image.

Step 2: Convert the colour image into grayscale image.

Step 3: Apply histogram equalization on grayscale image.

Step 4: Perform pre-processing by using morphological operations,

namely, erosion, reconstruction and dilation.

Step 5: Segment the image of Step 4 by global thresholding and obtain

resulting binary image.

Step 6: Remove the border touching cells from the binary image and then label the segmented binary image.

Step 7: For each labelled segment, compute the geometric shape features (area, MajorAxisLength/MinorAxisLength, perimeter, circularity) and store them. Let $a_i^k$ be the value of the ith parameter for the kth class, where i = 1, 2, 3, 4 correspond to area, MajorAxisLength/MinorAxisLength, perimeter and circularity, respectively, and k = 1, 2, 3 correspond to lymphocyte, monocyte and neutrophil, respectively.

Step 8: Repeat Steps 1 to 7 for all the training images.

Step 9: Compute the minimum and maximum values of the features of the leukocyte cells, denoted by $a_{i,\min}^{k}$ and $a_{i,\max}^{k}$, for all i and k, and store them as the knowledge base.

Algorithm 2: Testing phase

Step 1: Input the leukocyte colour cell image.

Step 2: Convert the colour image into a grayscale image.

Step 3: Apply histogram equalization on the grayscale image.

Step 4: Perform pre-processing by using morphological operations, namely, erosion, reconstruction, and dilation.

Step 5: Segment the image of Step 4 using global thresholding and obtain the resulting binary image.

Step 6: Remove the border touching cells from the binary image and then label the segmented binary image.

Step 7: For each labelled segment, compute the geometric shape features $a_i$, i = 1, 2, 3, 4.

Step 8: Apply the rule for classification of the leukocyte cells: if $a_i$ lies in the range $[a_{i,\min}^{k}, a_{i,\max}^{k}]$ for i = 1, 2, 3, 4, then the cell (labelled segment) belongs to the kth class, where k = 1, 2, 3 corresponds to lymphocyte, monocyte, and neutrophil respectively.

Step 9: Repeat Steps 7 and 8 for all labelled segments and output the classification of the identified leukocyte cells.

The image processing techniques were applied to extract the needed features (e.g. size, colour) for the types of WBC.
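The knowledge base built in the training phase and the range rule applied in the testing phase can be sketched together. The sketch assumes that the features of each labelled segment have already been extracted. The function names `train_ranges` and `classify` are hypothetical, and returning `None` for a segment that fits no class is an assumption, since the text does not say how such segments are handled.

```python
def train_ranges(samples):
    """Build the knowledge base: per-class (min, max) for every feature.

    samples: list of (features, label) pairs, one per labelled training segment.
    """
    ranges = {}
    for features, label in samples:
        if label not in ranges:
            ranges[label] = [(v, v) for v in features]
        else:
            ranges[label] = [(min(lo, v), max(hi, v))
                             for (lo, hi), v in zip(ranges[label], features)]
    return ranges

def classify(features, ranges):
    """Return the first class whose ranges contain every feature, else None."""
    for label, bounds in ranges.items():
        if all(lo <= v <= hi for v, (lo, hi) in zip(features, bounds)):
            return label
    return None
```

With four features per segment, `train_ranges` produces four (min, max) pairs for each of lymphocyte, monocyte and neutrophil. If the ranges of two classes overlap, the rule as sketched returns the first matching class, which is a limitation of a pure range test.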

An improved algorithm has also been proposed for the identification and classification of white blood cells in digital microscopic images using a colour image segmentation method. The ratio of the areas of nucleus and cytoplasm of a cell is presented as a prominent feature. For a given input image, the colour image analysis is carried out based on the HSV model. The experimental results are compared with the manual results obtained by a pathologist. The performance of the proposed algorithm is analyzed for four different feature sets.

The input RGB image of the leukocyte cell is converted into HSV colour space, and at first only the hue is considered. Observation shows that the hue value for the cell lies between 0.7 and 0.85, so pixels whose hue falls in this range are extracted as the cell portion, which still contains spurious regions. These spurious regions are eliminated by removing regions whose total number of pixels is less than $T_A = 1000$, which leaves the actual cell region. The nucleus is then extracted from the already extracted cell region using only the saturation: empirically, the nucleus has a high saturation, above 0.45. Applying these thresholds yields binarized images of the nucleus and of the cell region.
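
The thresholding described above can be sketched in pure Python. The hue range, the saturation threshold, and $T_A$ are the values quoted in the text; everything else is an assumption made for illustration: the function names, the representation of the image as rows of 8-bit `(r, g, b)` tuples, and the use of 4-connectivity when measuring region size (the source does not state the connectivity).

```python
import colorsys
from collections import deque

HUE_MIN, HUE_MAX = 0.7, 0.85  # hue range of the cell, as quoted in the text
SAT_MIN = 0.45                # saturation threshold of the nucleus, as quoted
MIN_AREA = 1000               # T_A, as quoted in the text


def remove_small_regions(mask, min_area):
    """Keep only 4-connected regions with at least min_area pixels (assumed connectivity)."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    kept = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, region = deque([(y, x)]), []
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_area:
                for ry, rx in region:
                    kept[ry][rx] = True
    return kept


def segment_cell_and_nucleus(rgb_image, min_area=MIN_AREA):
    """Return (cell_mask, nucleus_mask) as lists of lists of bool."""
    hue_mask, sat_mask = [], []
    for row in rgb_image:
        hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for r, g, b in row]
        hue_mask.append([HUE_MIN <= h <= HUE_MAX for h, _, _ in hsv])
        sat_mask.append([s > SAT_MIN for _, s, _ in hsv])
    cell = remove_small_regions(hue_mask, min_area)
    # The nucleus is searched only inside the cleaned cell region.
    nucleus = [[c and s for c, s in zip(c_row, s_row)] for c_row, s_row in zip(cell, sat_mask)]
    return cell, nucleus
```

`colorsys.rgb_to_hsv` returns hue and saturation in the range 0 to 1, which matches the scale of the thresholds quoted above.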

Finally, the cytoplasm region is obtained by subtracting the binary image of the nucleus from that of the cell. For the experiments, three feature sets, namely F2, F3, and F4, are used for classification and compared with the results obtained for feature set F1 of the previous method:

F2 = (area, eccentricity, equivdiameter, perimeter, circularity)

F3 = (area, eccentricity, equivdiameter, perimeter, circularity, ratio of areas of nucleus and cytoplasm)

F4 = (area, majoraxislength/minoraxislength, perimeter, circularity, ratio of areas of nucleus and cytoplasm)

The feature set F1 = (area, majoraxislength/minoraxislength, perimeter, circularity) is extended to F4, and F2 is extended to F3, by adding one extra feature, namely the ratio of nucleus area to cytoplasm area.
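
How the extra feature and the four feature sets fit together can be sketched as follows. This is an illustrative sketch only: the function names and the dictionary keys of `shape` are assumptions, and the masks are taken to be lists of lists of booleans.

```python
def count_true(mask):
    """Number of set pixels in a binary mask."""
    return sum(1 for row in mask for value in row if value)


def cytoplasm_mask(cell_mask, nucleus_mask):
    """Cytoplasm = cell region minus nucleus region."""
    return [[c and not n for c, n in zip(c_row, n_row)]
            for c_row, n_row in zip(cell_mask, nucleus_mask)]


def nucleus_cytoplasm_ratio(cell_mask, nucleus_mask):
    """Ratio of nucleus area to cytoplasm area, both in pixels."""
    cytoplasm_area = count_true(cytoplasm_mask(cell_mask, nucleus_mask))
    nucleus_area = count_true(cell_mask) - cytoplasm_area
    if cytoplasm_area == 0:
        raise ValueError("cell region contains no cytoplasm pixels")
    return nucleus_area / cytoplasm_area


def build_feature_sets(shape, nc_ratio):
    """Assemble F1..F4 from a dict of shape measurements (key names are assumed)."""
    axis_ratio = shape["major_axis_length"] / shape["minor_axis_length"]
    f1 = (shape["area"], axis_ratio, shape["perimeter"], shape["circularity"])
    f2 = (shape["area"], shape["eccentricity"], shape["equiv_diameter"],
          shape["perimeter"], shape["circularity"])
    # F3 extends F2 and F4 extends F1 by the nucleus/cytoplasm area ratio.
    return {"F1": f1, "F2": f2, "F3": f2 + (nc_ratio,), "F4": f1 + (nc_ratio,)}
```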

5.8 EXPERIMENTAL RESULTS

To evaluate the techniques, experiments are conducted on various blood cell images. A blood cell image contains RBCs, WBCs, and platelets. Only the WBCs are segmented, and the number of WBCs detected by each technique is compared with the number actually present in the image, which is obtained manually. Out of the 85 images, 62 samples are used as training data and the remaining 23 samples as testing data. The ground truth for the complete dataset was collected from the pathologist.


A Relevance Vector Machine classifier was compared with its support vector counterpart, using the same Gaussian kernel. A value of C for the SVM was selected.
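
For orientation, the Gaussian kernel and the form of an RVM-style probabilistic prediction can be sketched as follows. This is a minimal sketch, not the implementation used in the experiments: the kernel is written in one common parameterisation, $\exp(-\lVert x - z \rVert^2 / 2\sigma^2)$, and the relevance vectors, weights, bias, and kernel width passed to the functions are hypothetical inputs that a trained model would supply.

```python
import math


def gaussian_kernel(x, z, width=1.0):
    """Gaussian (RBF) kernel between two feature vectors; width is sigma (assumed form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * width ** 2))


def rvm_predict_proba(x, relevance_vectors, weights, bias=0.0, width=1.0):
    """Probability of the positive class: logistic sigmoid of a sparse kernel expansion."""
    activation = bias + sum(
        w * gaussian_kernel(x, rv, width) for w, rv in zip(weights, relevance_vectors)
    )
    return 1.0 / (1.0 + math.exp(-activation))
```

Only the retained relevance vectors enter the sum, which is why a sparser model is cheaper to evaluate at test time.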

Table 5.1 presents the geometric feature values computed for the segmented leukocyte cells of the image, namely lymphocytes, monocytes, and neutrophils.

Table 5.1 The geometric feature values (F1) of the cell regions of the images in Figure 1(c), (f) and (i)

Cell Type     Area    MajorAxisLength/MinorAxisLength    Perimeter    Circularity
Lymphocyte    883     1.0605                             128          0.67734
Monocyte      1340    1.2850                             220          0.34796
Neutrophil    1806    1.3428                             314          0.34796
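
The text does not state how circularity is computed. A common definition is $4\pi A / P^2$, where $A$ is the area and $P$ the perimeter; the sketch below assumes that definition (the function name is also an assumption). Under this assumption the Lymphocyte and Monocyte rows of Table 5.1 are reproduced to about three decimal places, while the Neutrophil row evaluates to about 0.23 rather than the listed 0.34796, so that entry may be worth checking against the original data.

```python
import math


def circularity(area, perimeter):
    """Assumed definition: 4*pi*A / P^2, which equals 1.0 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2


# (area, perimeter, circularity as listed in Table 5.1)
TABLE_5_1 = {
    "Lymphocyte": (883, 128, 0.67734),
    "Monocyte": (1340, 220, 0.34796),
    "Neutrophil": (1806, 314, 0.34796),
}

for name, (area, perimeter, listed) in TABLE_5_1.items():
    print(f"{name}: computed {circularity(area, perimeter):.4f}, listed {listed}")
```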

5.8.1 Accuracy and Processing Time of RVM

The accuracy and processing time for SVM, RVM, and the proposed Fast RVM are shown in Table 5.2. For every feature set, the proposed approach gives higher accuracy and lower processing time than both SVM and standard RVM.

Table 5.2 Accuracy and Processing Time for Fast RVM

               Accuracy (%)                  Processing Time (Seconds)
Feature Set    SVM      RVM      Fast RVM    SVM    RVM    Fast RVM
F1             80.23    85.12    92.53       59     34     20
F2             78.69    81.43    86.33       52     41     28
F3             73.32    80.54    83.78       48     35     19
F4             80.63    87.28    97.56       45     28     16
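
The figures of Table 5.2 can be compared directly with a few lines of Python. The sketch below only restates the table values and derives two simple quantities from them, the accuracy gain in percentage points and the ratio of processing times; the dictionary layout and function names are assumptions.

```python
# Values copied from Table 5.2: accuracy in percent, processing time in seconds.
TABLE_5_2 = {
    "F1": {"accuracy": {"SVM": 80.23, "RVM": 85.12, "Fast RVM": 92.53},
           "time": {"SVM": 59, "RVM": 34, "Fast RVM": 20}},
    "F2": {"accuracy": {"SVM": 78.69, "RVM": 81.43, "Fast RVM": 86.33},
           "time": {"SVM": 52, "RVM": 41, "Fast RVM": 28}},
    "F3": {"accuracy": {"SVM": 73.32, "RVM": 80.54, "Fast RVM": 83.78},
           "time": {"SVM": 48, "RVM": 35, "Fast RVM": 19}},
    "F4": {"accuracy": {"SVM": 80.63, "RVM": 87.28, "Fast RVM": 97.56},
           "time": {"SVM": 45, "RVM": 28, "Fast RVM": 16}},
}


def accuracy_gain(feature_set, baseline="SVM"):
    """Accuracy of Fast RVM minus accuracy of the baseline, in percentage points."""
    accuracy = TABLE_5_2[feature_set]["accuracy"]
    return accuracy["Fast RVM"] - accuracy[baseline]


def speedup(feature_set, baseline="SVM"):
    """Processing time of the baseline divided by that of Fast RVM."""
    time = TABLE_5_2[feature_set]["time"]
    return time[baseline] / time["Fast RVM"]


# Feature set with the highest Fast RVM accuracy.
best_feature_set = max(TABLE_5_2, key=lambda fs: TABLE_5_2[fs]["accuracy"]["Fast RVM"])
```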


The accuracy of SVM, RVM, and Fast RVM is shown in Figure 5.2. The proposed Fast RVM has higher accuracy than the other methods.

Figure 5.2 Accuracy for proposed Fast RVM

Figure 5.3 compares the processing time of the existing SVM and RVM with that of the proposed Fast RVM. The processing time of Fast RVM is the lowest of the three methods.

Figure 5.3 Processing Time for proposed Fast RVM

[Figure 5.2 plot: Accuracy (%) versus Feature Set (F1 to F4) for SVM, RVM, and Fast RVM]

[Figure 5.3 plot: Processing Time (Seconds) versus Feature Set (F1 to F4) for SVM, RVM, and Fast RVM]


During the testing phase, the test values are compared with the manual knowledge base for each type of blood cell. The classification efficiency is measured for three feature sets, namely F1, F2, and F3. Out of the 85 samples, 62 are used as training data and the remaining 23 as testing data.

The proposed Fast RVM is compared with the standard RVM in terms of testing efficiency. Table 5.3 gives this comparison for the three feature sets F1, F2, and F3.

Table 5.3 Testing efficiency Comparison

                                  Testing Efficiency
Feature    Activation Function    RVM    Fast RVM
F1         Unipolar sigmoid       67%    85%
           Bipolar sigmoid        65%    83%
           Radial basis kernel    63%    81%
F2
F3

These feature sets are based on area, perimeter, convex length, diameter, and number of lobes. The activation functions used for this comparison are the unipolar sigmoid, the bipolar sigmoid, and the radial basis kernel. For the proposed Fast RVM, a testing efficiency of up to 85% is obtained for feature set F1 and around 89% for feature set F2; compared with F1 and F2, F3 gives the maximum efficiency, up to 91%. The existing standard RVM gives a testing efficiency of up to 67% for feature set F1 and around 69% for feature set F2, with F3 again giving the maximum, up to 70%. The table above shows clearly that the proposed Fast RVM gives better results than the standard RVM, and that its overall testing efficiency is best for F3.
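
The three activation functions are not defined in the text. The sketch below assumes their usual textbook forms: the unipolar (logistic) sigmoid with outputs in (0, 1), the bipolar sigmoid with outputs in (-1, 1), and a Gaussian radial basis function. The function names and the `centre` and `width` parameters are assumptions.

```python
import math


def unipolar_sigmoid(x):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bipolar_sigmoid(x):
    """Sigmoid rescaled to (-1, 1); equal to tanh(x / 2)."""
    return 2.0 * unipolar_sigmoid(x) - 1.0


def radial_basis(x, centre=0.0, width=1.0):
    """Gaussian radial basis function, equal to 1.0 at the centre (assumed form)."""
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))
```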

5.9 SUMMARY

A medical result support system known as Leuko has been developed for leukemia diagnosis using a Naive Bayes classifier. The scheme is able to distinguish six types of white blood cells (WBC), including a malignancy. This research examines the use of Fast Relevance Vector Machine (FRVM) classifiers to identify WBCs for future leukemia diagnosis. Since RVMs were initially designed for two-class problems, a number of strategies for extending them to this multiclass task are examined and compared. The proposed method uses discriminative shape, colour, and texture features, which contain information for better discrimination of bone marrow cells. Further, feature selection methods based on mutual information distribution and recursive feature elimination are used along with FRVM for effective classification. The results are analyzed, and 93% has been achieved.
