[IEEE 2012 International Conference on Advances in Computing and Communications (ICACC) - Cochin, Kerala, India (2012.08.9-2012.08.11)] 2012 International Conference on Advances in

COPD Prognosis under Biologically Inspired Neural Network

Komathy Karuppanan Easwari Engineering College

Chennai, India [email protected]

Abinaya Sree Vairasundaram Easwari Engineering College


Manjula Sigamani Easwari Engineering College


Abstract - This paper proposes a prognostic model for the rehabilitation of chronic obstructive pulmonary disease (COPD) patients in real time. The proposed approach applies a comprehensive predictive model based on time series forecasting with a condensed polynomial neural network and swarm intelligence. Discrete particle swarm optimization (DPSO) filters out the relevant neurons and continuous particle swarm optimization (CPSO) reduces the computational overheads. The time series prediction is further strengthened by a multimodal genetic algorithm. The state of the patient is classified by a hybrid of fuzzy C-means and support vectors. Control measures are applied to validate the predicted state of the patient.

Key words- Swarm Intelligence, Time Series Prediction, Support vector classifier, Condensed Polynomial Neural Network, Genetic Algorithm.

I. INTRODUCTION

Prognosis, or early diagnosis, of chronic cases is vital in today's health care scenario. A survey report [11] says that 268 patients with chronic airflow limitation die every year in tertiary care centres in India due to a lack of advanced monitoring. 17% of the patients die from chronic obstructive pulmonary disease (COPD). COPD prevalence is reportedly significant in 12 Asian countries including India, and the number rises sharply in winter [10]. The elderly living alone in Tamil Nadu State, India [9] deserve special mention, since this population is large there. The solution is either to employ a facilitator or to monitor patients continuously at home using ambient intelligence technologies. Typically, these methods send the patient's data to a medical team for processing. From the time the patient returns home from the hospital after being treated for a severe exacerbation, he or she undergoes rehabilitation therapy and registers for healthcare from then on [12]. The healthcare system monitors the patient at home in real time and provides the medical team with vital information. This paper attempts to design such a healthcare system for automatic prognosis of the patient's condition and for the control measures that follow. Currently, only a few healthcare systems provide prognostic information. Pasquale Arpaia et al. [1] proposed an in-time prognosis method based on Particle Swarm Optimization (PSO) and fuzzy logic, in which the predicted physiological parameters are given PSO weights and the next state of the patient is fuzzified in accordance with the GOLD criteria [8]. In this paper, a polynomial neural network (PNN) predictor is modified with swarm intelligence to improve the accuracy and reliability of the prognosis.

II. DESIGN OF THE EMPIRICAL MODEL

The physiological parameters considered in our model to assess a patient with COPD are FEV1%, BMI, the MMRC scale, and the 6-minute walk test. The computational components of the proposed system are shown in Figure 1. The physiological parameters of the patient are collected periodically. The parameters are obtained by wearable sensors for lung function monitoring.

A. Clustering

The current state of the patient is clustered using the fuzzy C-means (FCM) clustering algorithm hybridized with a linear-kernel support vector classifier [5]. This method is an improvement over the Gaussian-kernel support vector classifier. Each value, Pr( ), after evaluation by means of PNN is given a weight using PSO in order to compute a number of criticality index (CI) values. CI provides a sound basis for deciding whether or not the patient is critically ill.

Figure 1. A comprehensive architecture of the biologically inspired techniques used in prognosis

CI for each particle is obtained using the Equation (1),

CI( ) = (1)

where is the weight for every individual attribute of the particle in the range (0,1) and is the weight for predicted value of the particle in the range (1,10). FCM

2012 International Conference on Advances in Computing and Communications

978-0-7695-4723-7/12 $26.00 © 2012 IEEEDOI 10.1109/ICACC.2012.6


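uses the weights ($u_{ij}$) that minimize the total weighted mean-square error as given in Equation (2),

$$J_m = \sum_{i=1}^{N}\sum_{j=1}^{C} u_{ij}^{m}\,\lVert x_i - c_j\rVert^{2} \qquad (2)$$

where $1 \le m < \infty$, m is a real number greater than 1; N is the number of particles and C is the number of classes; $u_{ij}$ is the degree of association of $x_i$ in the class j; $x_i$ is the ith d-dimensional measured data; $c_j$ is the d-dimension centre of the cluster and ||*|| represents the Euclidean distance. Each particle is segregated through an iterative optimization of $J_m$ with the updated association function $u_{ij}$ and the class centres $c_j$ using Equation (3) and Equation (4) respectively.

$$u_{ij} = 1 \Big/ \sum_{l=1}^{C}\left(\frac{\lVert x_i - c_j\rVert}{\lVert x_i - c_l\rVert}\right)^{\frac{2}{m-1}} \qquad (3)$$

$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m}\,x_i}{\sum_{i=1}^{N} u_{ij}^{m}} \qquad (4)$$

where ||*|| is the Euclidean distance given by Equation (5),

$$\lVert x_i - c_j\rVert = \sqrt{\sum_{k=1}^{d}\left(x_{ik} - c_{jk}\right)^{2}} \qquad (5)$$

The association function of the particle is a fuzzy truth value between 0 and 1 that gives the degree of association between the particle and the centres of the classes, and satisfies Equation (6),

$$\sum_{j=1}^{C} u_{ij} = 1 \qquad (6)$$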
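The alternating update of centres and association values can be illustrated with a minimal sketch. It assumes that Equations (3) and (4) take the standard fuzzy C-means form; the function name `fcm`, the sample points, the random initialization and the stopping tolerance are illustrative choices and are not taken from the paper.

```python
import math
import random


def fcm(points, n_clusters, m=1.25, max_iter=100, tol=1e-6, seed=0):
    """Fuzzy C-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    power = 2.0 / (m - 1.0)

    # Random initial association values, normalised so each row sums to 1 (Eq. 6).
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([value / total for value in row])

    centres = []
    for _ in range(max_iter):
        # Class centres as association-weighted means (standard form of Eq. 4).
        centres = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centres.append(
                [sum(w[i] * points[i][k] for i in range(n)) / w_sum for k in range(d)]
            )

        # Association update from relative distances (standard form of Eq. 3).
        new_u = []
        for x in points:
            dist = [max(math.dist(x, c), 1e-12) for c in centres]
            new_u.append([1.0 / sum((dj / dl) ** power for dl in dist) for dj in dist])

        change = max(
            abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(n_clusters)
        )
        u = new_u
        if change < tol:
            break
    return centres, u


# Hypothetical two-dimensional readings forming two well-separated groups.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centres, memberships = fcm(samples, n_clusters=2)
labels = [row.index(max(row)) for row in memberships]
```

Each row of `memberships` sums to 1, as Equation (6) requires, and `labels` assigns every sample to the class with the largest association value.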

Since each of the clusters could be multi-levelled, the number of clusters is not known exactly. FCM uses scalar measures such as partitioning entropy and proportion exponent to estimate the probable number of clusters. Clustering is implemented in three layers. The first layer is the input layer comprising the four parameters. In the second layer, each node represents the basis function of one cluster. The membership value of the nodes is calculated as in Equation (7),

$$\mu_j(\mathbf{x}) = 1 \Big/ \sum_{l=1}^{n}\left(\frac{\lVert \mathbf{x} - c_j\rVert}{\lVert \mathbf{x} - c_l\rVert}\right)^{\frac{2}{m-1}} \qquad (7)$$

where $\mathbf{x}$ is the vector of input variables; n is the number of clusters; $c_j$ are the centres of the clusters; and m is set to 1.25 as per experimentation results [5]. The third layer performs a weighted summation of the node values in the previous layer. The SVM learns by tuning the weight parameters for generalization performance. FCM and SVM are combined using a random initialization approach.

B. Prediction System

This prediction model applies a condensed polynomial neural network as given in the work of Dehuri et al [2] and is further altered to cater to the needs of time series prediction [6]. Discrete particle swarm optimization (DPSO) selects a relevant set of partial descriptions from the input layer, which also improves accuracy in the prediction process, whereas continuous particle swarm optimization (CPSO) optimizes the output in the continuous domain. The process of time series prediction under PNN and swarm intelligence, shown in Figure 2, is outlined in the following subsections.

1) DPSO for Input Layer Optimization: The noise level in the data is assessed by estimating the particle velocity by DPSO. The velocity is computed using Equation (8),

V[i] = (x[i] – x[i-1]) / (t[i] - t[i-1]) (8)

Velocity portrays the change in patient condition with a positive value for progress, negative value for deterioration and 0 for a stable state. The velocity computed also provides an input to the CPSO.
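Equation (8) is a finite difference over consecutive readings, and a small sketch makes the sign convention concrete. The function names `velocities` and `trend` and the sample readings are hypothetical and serve only as illustration.

```python
def velocities(x, t):
    """Rate of change between consecutive readings, as in Equation (8)."""
    if len(x) != len(t):
        raise ValueError("x and t must have the same length")
    result = []
    for i in range(1, len(x)):
        dt = t[i] - t[i - 1]
        if dt == 0:
            raise ValueError("time stamps must be distinct")
        result.append((x[i] - x[i - 1]) / dt)
    return result


def trend(v):
    """Map a velocity to the interpretation given in the text."""
    if v > 0:
        return "progress"
    if v < 0:
        return "deterioration"
    return "stable"


# Hypothetical readings of one parameter at hours 0, 1, 2 and 4.
readings = [52.0, 53.0, 53.0, 51.0]
hours = [0, 1, 2, 4]
v = velocities(readings, hours)
states = [trend(value) for value in v]
```

For these readings the velocities are 1.0, 0.0 and -1.0, which the text interprets as progress, a stable state and deterioration respectively.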

Figure 2. Condensed Polynomial Neural Network strengthened with MGA

2) Time Series Prediction using MGA: The time series prediction is modelled using a PNN in which each layer is optimized by MGA. The algorithm steps involved in the prediction process are briefly described below.

Initial Population Set up: The physiological parameters obtained from the patient during the last 24 hours constitute the initial population for the neural network.

Operators of Genetic Algorithm: The subsequent layers of the neural network are obtained recursively by implementing MGA. MGA produces an optimized output as multimodal points in a continuous time domain. A real-coded genetic algorithm (RGA) [7], when applied to real parameter spaces, has chromosomes as vectors of floating point numbers and alleles as real numbers. The multi-modality of the algorithm refers to a fitness sharing scheme in which the fitness of the node depends on the niche count and the shared radius, as defined by Equation (9),

$$f_i' = \frac{f_i}{m_i} \qquad (9)$$

where $f_i$ is the original fitness of the node and is given by Equation (10).

(10)

where $N_v$ is the number of points in the validation set and $x(t)$ is the actual value at time t, against which the output of the node at time t is compared. $m_i$ is the niche count defined by Equation (11),

$$m_i = \sum_{j=1}^{N} sh(d_{ij}) \qquad (11)$$

where N is the population size of RGA and $d_{ij}$ is the distance between the nodes, that is, the difference between i and j, the members of the population. The sharing function given in Equation (12) takes $d_{ij}$ as the distance between i and j, the members of the population, and $\alpha$ as 1 for a triangular sharing function. Based on the condition given in Equation (13), the shared radius $\sigma_s$ is estimated.

$$sh(d_{ij}) = \begin{cases} 1 - \left(\dfrac{d_{ij}}{\sigma_s}\right)^{\alpha}, & d_{ij} < \sigma_s \\ 0, & \text{otherwise} \end{cases} \qquad (12)$$

= (13)

where l is the string length of the chromosome and q is the number of nodes in the previous layer of the network. Single-point crossover and uniform mutation are adopted as the genetic operators. To avoid overloading after crossover, the method of the real-valued genetic algorithm [7] is used. The two chromosomes, X1 and X2, are genetically crossed when they move closer as given in Equation (14) and Equation (15),

X1' = X1 + σ(X1 - X2) (14)

X2' = X2 - σ(X1 - X2) (15)

and when they move apart as listed in Equation (16) and Equation (17),

X1' = X1 + σ(X2 - X1) (16)

X2' = X2 - σ(X2 - X1) (17)

where X1' and X2' are the newly generated chromosomes; σ is the micro-random number controlling the variance of the crossover in RGA [7]. After crossover, uniform mutation is applied with the mutation parameter k. The mutation for a chromosome X is modelled in Equation(18) through Equation (20).

X = {x1, x2, .....,xn} (18)

xk' = LBk + r (UBk - LBk) (19)

X' = {x1 , x2, ...., xk', ....... xn} (20)

where r is a random number in the range (0,1), and LBk and UBk are the lower bound and upper bound of the kth field in the chromosome. The new chromosomes are placed back into the population for subsequent reproduction. Elimination in RGA is based on a fitness measurement of the nodes in terms of the description length, L, obtained as per Equation (21),

$$L = 0.5\,n \log(\mathrm{MSE}) + 0.5\,m \log n \qquad (21)$$

where MSE is the mean squared error; n is the number of observations used in computing the mean squared error; and m is the number of coefficients of the polynomial. Parents are selected on the condition that the distance between them is greater than the shared radius, so that the offspring are more diverse. The algorithm terminates when either the fitness value converges or the niche radius reaches 1 for the population.

Update the radius of the PNN: The number of layers in a polynomial neural network is a trade-off between the accuracy of prediction and the complexity of the network. Equation (22) introduces a radius [4] with an initial value for the network as,

$$r(0) = \max\{\operatorname{distance}(e_1, e_2) : e_1, e_2 \in D\} \qquad (22)$$

where e1 and e2 are the data particles in the data domain, D. Subsequently the niche radius is updated using Equation (23).

$$r(i) = r(0)\,\beta^{i} \qquad (23)$$

where β is a random constant in (0,1). The radius is calculated for every ith layer until r(n) reaches 1 or the MGA gives only one optimized solution. The output is de-normalized and given as the predicted values of the patient's physiological parameters. The algorithm is applied to every 24-hour window, and the error is propagated each time as an update to the coefficients of the polynomial. Experimental results within 100 generations and β = 0.05 show that the algorithm can produce an optimal predictive accuracy.

3) CPSO for Output Optimization: CPSO is a group of particles exploring the best position in their neighbourhood, where the behaviour of each particle is attracted by the local best and global best particles. An individual can adapt its velocity and position from its past experience. Therefore, each particle remembers its best visited position in the search space, as in Figure 3.

For each patient, consider a flock F containing n particles (i = 1, 2, ..., n) in a d-dimensional continuous search space. Each ith particle of the flock communicates with a social environment or neighbourhood, N(i), which changes dynamically. Each ith particle reveals the attractiveness of its best position (Pbpi) and the best position (Nbpi) of its neighbourhood N(i) by updating its velocity and its position according to the constriction factor method proposed by Clerc [2]. Equation (24) and Equation (25) derive vi and si at time (t+1).

Figure 3. CPSO algorithm

$$v_i(t+1) = \chi\,[\,v_i(t) + \varphi_1 r_1 (Pbp_i - s_i(t)) + \varphi_2 r_2 (Nbp_i - s_i(t))\,] \qquad (24)$$

$$s_i(t+1) = s_i(t) + v_i(t+1) \qquad (25)$$

where i = 1, 2, ..., n, and n is the size of the flock; $\chi$ is the constriction factor; $\varphi_1$ is the individual acceleration coefficient; $\varphi_2$ is the societal acceleration coefficient; and $r_1$ and $r_2$ are two random sequences uniformly distributed in the range (0,1). The prognostic feature of PNN is enhanced by revising the predicted values Pr( ) through weight coefficients obtained from Equation (26),

(26)

so that each term × Pr( )) should be close to the fuzzified values ϑ( ) actually observed in the past i measurements for each physiological parameter. The best position of the particle is determined by minimizing the fitness function given by the Equation (27),

f = ϑ (27)

where Pr( ) is the predicted value and ϑ( ) is the fuzzified value of the particle.
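The constriction-factor update can be sketched as follows. Several points are assumptions rather than the authors' settings: the constriction factor is computed with the commonly used Clerc-Kennedy expression, the neighbourhood is the whole flock rather than a dynamically changing N(i), the acceleration coefficients are both 2.05, and the objective `squared_gap` is a stand-in that pulls each weighted prediction towards a hypothetical fuzzified observation. It is not Equation (27).

```python
import math
import random


def constriction(phi1, phi2):
    """Clerc-Kennedy constriction factor; requires phi1 + phi2 > 4."""
    phi = phi1 + phi2
    if phi <= 4.0:
        raise ValueError("phi1 + phi2 must exceed 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def cpso(fitness, dim, n_particles=20, iters=200,
         phi1=2.05, phi2=2.05, lo=0.0, hi=1.0, seed=0):
    """Minimise `fitness` with a global-best constricted particle swarm."""
    rng = random.Random(seed)
    chi = constriction(phi1, phi2)
    s = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in s]
    pbest_f = [fitness(p) for p in s]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]

    for _ in range(iters):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update followed by position update.
                v[i][k] = chi * (v[i][k]
                                 + phi1 * r1 * (pbest[i][k] - s[i][k])
                                 + phi2 * r2 * (gbest[k] - s[i][k]))
                s[i][k] += v[i][k]
            f = fitness(s[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = s[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = s[i][:], f
    return gbest, gbest_f


# Hypothetical predicted values and fuzzified observations for three parameters.
predicted = [0.5, 0.8, 0.4]
observed = [0.6, 0.72, 0.5]


def squared_gap(w):
    """Stand-in objective: squared distance between w * Pr and the observation."""
    return sum((wk * p - o) ** 2 for wk, p, o in zip(w, predicted, observed))


best_w, best_f = cpso(squared_gap, dim=3, lo=0.0, hi=2.0)
```

With these inputs the objective is zero at the weights (1.2, 0.9, 1.25), so `best_w` should settle close to those values.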

III. EXPERIMENTAL VALIDATION

The quality of the system is evaluated in terms of accuracy, specificity and sensitivity. Sensitivity reflects the number of worsening states that are correctly predicted, specificity the number of improving states that are correctly predicted, and accuracy the number of correct predictions as a whole. The quality of the prognostic system is assessed by means of statistical clinical tests as per Gibbs specifications [3]. Clinical tests are made using a validation data set acquired from real-time data collected from hospitals around the city of Chennai, India. The quality of the prognostic system is then computed using the following parameters from the validation set: W is the number of degrading states that are correctly predicted; X is the number of unwavering states that are erroneously forecast as degrading states; Y is the number of unwavering states that are erroneously estimated as improving states; and Z is the number of improving states that are correctly forecast as improving states. Accuracy of forecast is given by Equation (28), sensitivity by Equation (29), and specificity by Equation (30).

$$A_p = \frac{W + Z}{W + X + Y + Z} \times 100 \qquad (28)$$

$$S_e = \frac{W}{W + Y} \qquad (29)$$

$$S_p = \frac{Z}{X + Z} \qquad (30)$$

The false positive rate is estimated from Equation (31) and the false negative rate is computed from Equation (32),

κ = 1 - Sp (31)

η = 1 - Se (32)

where (W+X+Y+Z) is the cardinality of the population, (W+Z) is the number of correct prognosis, (X+Y) is the number of incorrect prognosis. The proposed model is tested with real data acquired from a patient on hourly basis. Table 1 shows the comparative data between the validation and the real time.
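The quality measures can be checked against Table 1 with a short sketch. It assumes that accuracy is (W+Z)/(W+X+Y+Z) scaled to a percentage, sensitivity is W/(W+Y) and specificity is Z/(X+Z); these forms reproduce the Ap, Se and Sp columns of Table 1 to within 0.01. The function name `prognosis_metrics` is illustrative.

```python
def prognosis_metrics(w, x, y, z):
    """Accuracy (%), sensitivity, specificity and error rates from W, X, Y, Z."""
    total = w + x + y + z
    if total == 0 or (w + y) == 0 or (x + z) == 0:
        raise ValueError("counts must allow every ratio to be computed")
    sensitivity = w / (w + y)
    specificity = z / (x + z)
    return {
        "Ap": 100.0 * (w + z) / total,
        "Se": sensitivity,
        "Sp": specificity,
        "kappa": 1.0 - specificity,  # false positive rate, Equation (31)
        "eta": 1.0 - sensitivity,    # false negative rate, Equation (32)
    }


# Counts taken from the two rows of Table 1.
validation = prognosis_metrics(259, 10, 6, 222)
real_time = prognosis_metrics(72, 2, 1, 42)
```

For the validation row the accuracy is 481/497, about 96.78%, and for the real-time row it is 114/117, about 97.44%, which agrees with the tabulated 97.43 if the table values are truncated rather than rounded.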

The proposed method achieves an accuracy of 97.43% on real-time data. Moreover, Table 1 shows a difference of 0.03 between the false positive rate and the false negative rate, and thus the model ensures improved accuracy in prediction. The validation of the model can be depicted using a resilient state machine diagram, as shown in Figure 4.

Figure 4. Resilient state machine diagram for the prognosis model

TABLE 1. RESULT ANALYSIS

State       Size  W    X   Y  Z    Ap%    Se    Sp    κ     η
Validation  500   259  10  6  222  96.78  0.97  0.95  0.05  0.03
Real time   120   72   2   1  42   97.43  0.98  0.95  0.05  0.02

Step 1: Initialize the centre for each class and the association vector U = [uij], U(0)
Step 2: At each kth iteration, calculate the centre for each class, cj
Step 3: Update the association vector U(k), $u_{ij} = 1 \big/ \sum_{l=1}^{C} \left(\lVert x_i - c_j\rVert / \lVert x_i - c_l\rVert\right)^{2/(m-1)}$
Step 4: If the termination criterion is met, then STOP; otherwise return to Step 2.


The finite state machine (FSM) starts in any of the states marked N, M, H or S, which represent normal, moderate, high and severe respectively, according to the patient's current state. The states AL1 and AL2 are the two levels of alert sent to the doctor. When a transition occurs that is not reasonable under actual physical conditions, the state machine moves to an invalid state, I. The state AA is an alert to the administrator indicating a malfunction of the system. The empirical system is validated by the logic of the state machine diagram.
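The validation logic can be illustrated with a small sketch. The transition rules are not given in the text, so the ones used here are purely hypothetical: a predicted state is accepted if it is at most one severity level away from the current state, AL1 is raised on entering H, AL2 is raised on entering S, and any larger jump leads to the invalid state I together with the administrator alert AA.

```python
SEVERITY = ["N", "M", "H", "S"]          # normal, moderate, high, severe
ALERT_FOR = {"H": "AL1", "S": "AL2"}     # hypothetical mapping of states to alerts


def step(current, predicted, max_jump=1):
    """Return (next_state, alerts) for one predicted transition.

    The `max_jump` rule is an assumed plausibility check, not the paper's.
    """
    if current not in SEVERITY or predicted not in SEVERITY:
        raise ValueError("states must be one of N, M, H, S")
    jump = abs(SEVERITY.index(predicted) - SEVERITY.index(current))
    if jump > max_jump:
        # Implausible transition: invalid state plus administrator alert.
        return "I", ["AA"]
    alert = ALERT_FOR.get(predicted)
    return predicted, ([alert] if alert else [])


accepted = step("M", "H")
rejected = step("N", "S")
```

Under these assumed rules, a move from M to H is accepted and raises AL1, while a direct move from N to S is treated as a malfunction.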

IV. CONCLUSION

A prognostic model for time-critical monitoring of COPD patients at home is proposed in this paper. The system achieves high accuracy and a low false prediction rate. A method to predict a critical condition of a patient affected by a specific disease, in the absence of the primary clinician, has been proposed and tested experimentally; it is based on a swarm intelligence optimization procedure and a polynomial neural network prediction model enhanced with a multimodal genetic algorithm. Experimental investigation shows that the model achieves an early prognosis of the disease with high accuracy.

Further work on this comprehensive machine learning approach will be devoted to validating the model on other non-communicable diseases and with a larger number of physiological parameters to be optimized.

REFERENCES

[1] P.Arpaia, C.Manna, G.Montenero, and G.D.Addio, “In-time Prognosis based on Swarm Intelligence for Home-Care Monitoring: a Case Study on Pulmonary Disease,” IEEE Transactions on Swarm Intelligence, Vol. 12, Issue 3, pp. 692 – 698, May 2011.

[2] M.Clerc, “The swarm and the queen: towards a deterministic and adaptive particle swarm optimization,” Proceedings of the Congress of Evolutionary Computation, Washington, pp. 1951-1957, 1995.

[3] A.L.Gibbs, and E.Braunwald, “Primary cardiology,” Saunders, Philadelphia.

[4] M.Jelasity and J.Dombi, "GAS, a Concept on Modelling Species in Genetic Algorithms," Artificial Intelligence, Elsevier, Amsterdam, vol.1, pp. 1-19, 1999.

[5] C.F.Juang, and C.D.Hsieh, “Fuzzy C-means based support vector machine for channel equalisation,” International Journal of General Systems, vol.38:3, pp. 273-289, 2009

[6] P.Liatsis, A.Foka, J.Y.Goulermas and L.Mandic, “Adaptive Polynomial Neural Networks for Times Series Forecasting,” 49th International Symposium ELMAR-2007, Zadar, Croatia, pp. 12-14, 2007.

[7] A.H.Wright, “Genetic Algorithms for Real Parameter Optimization”, Foundations of Genetic Algorithms, Morgan Kaufmann, 2001.

[8] GOLD--Available as on 13th March, 2012: www.goldcopd.org; BODE Index Available as on 13th March, 2012: www.pulmonaryrehab.com

[9] Indian Express News release dated 28th April, 2008, “Tamil Nadu with most number of Elderly living alone”, Available online as on 13th of March, 2012. http://articles.timesofindia.indiatimes.com/2008-04-28/chennai/27759889_1_elderly-women-tamil-nadu-elderly-men

[10] Survey for prevalence of COPD Available as on 13th March, 2012, www.clinicaltrials.gov

[11] PubMed--Available as on 13th March, 2012: www.ncbi.nim.nih.gov/pubmed/14999112.

[12] VHA/DOD clinical practice guideline for the management of Chronic obstructive pulmonary disease, prepared by The Chronic Obstructive Pulmonary Disease Workgroup, Version 2.0, 2007 Available online as on 13th March, 2012. www.healthquality.va.gov/copd
