Hindawi Publishing Corporation
Journal of Probability and Statistics
Volume 2013, Article ID 753930, 7 pages
http://dx.doi.org/10.1155/2013/753930

Research Article
Nonlinear Survival Regression Using Artificial Neural Network

Akbar Biglarian,1 Enayatollah Bakhshi,1 Ahmad Reza Baghestani,2 Mahmood Reza Gohari,3 Mehdi Rahgozar,1 and Masoud Karimloo1

1 Department of Biostatistics, University of Social Welfare and Rehabilitation Sciences (USWRS), Tehran 1985713834, Iran
2 Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran 1971653313, Iran
3 Hospital Management Research Center, Tehran University of Medical Sciences (TUMS), Tehran 1996713883, Iran

Correspondence should be addressed to Akbar Biglarian; [email protected]

Received 9 May 2012; Revised 21 November 2012; Accepted 23 November 2012

Academic Editor: Shein-chung Chow

Copyright © 2013 Akbar Biglarian et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Survival analysis methods deal with a type of data which is the waiting time until the occurrence of an event. One common method to analyze this sort of data is Cox regression. Sometimes the underlying assumptions of the model do not hold, such as nonproportionality for the Cox model. In model building, choosing an appropriate model depends on the complexity and the characteristics of the data, which affect the appropriateness of the model. One strategy that is frequently used nowadays is the artificial neural network (ANN) model, which requires only minimal assumptions. This study aimed to compare the predictions of the ANN and Cox models on simulated data sets, in which the average censoring rate ranged from 20% to 80% in both simple and complex models. All simulations and comparisons were performed with R 2.14.1.
1. Introduction

Many different parametric, nonparametric, and semiparametric regression methods are increasingly examined to explore the relationship between a response variable and a set of covariates. The choice of an appropriate method for modeling depends on the methodology of the survey and the nature of the outcome and explanatory variables.

A common research question in medical research is to determine whether a set of covariates is correlated with the survival or failure times. Two major characteristics of survival data are censoring and violation of the normality assumption required for ordinary least squares multiple regression. These two characteristics of the time variable are the reasons that straightforward multiple regression techniques cannot be used. Different parametric and semiparametric models for survival regression have been introduced, which model the survival or hazard function. Parametric models, for instance, exponential or Weibull, predict the survival function, while accelerated failure time models are parametric regression methods with the logarithm of failure time as the dependent variable [1, 2].

Choosing an appropriate model for the analysis of survival data depends on some conditions, which are called the underlying assumptions of the model. Sometimes, these assumptions may not be true, for example: (a) lack of independence between consecutive waiting times to the occurrence of an event, or nonproportionality of hazards in semiparametric models; (b) lack of independence of censoring, or the distribution of failure times in the case of parametric models [1-3].

Although the Cox regression model is an efficient strategy for analyzing survival data, when the assumptions of this model fail, assumption-free methods can be suitable. Artificial neural network (ANN) models, which are completely nonparametric, have been used increasingly in different areas of science.
Although analyzing data using ANN methodology is usually more complex than with traditional approaches, ANN models are more flexible and efficient when our main aim is the prediction or classification of an outcome using different explanatory variables [4-17].

Note that when several covariates and complex interactions are of concern, the best method is ANN; otherwise, based on model assumptions, simple regression models can be used appropriately.

In this study, simulated data sets with different rates of censoring were used to predict the outcome using ANN and


traditional Cox regression models, and then the results of the predictions were compared.

2. Methods

2.1. Cox Regression Model. Suppose that T denotes a continuous nonnegative random variable describing the failure time of an event (i.e., time-to-event) in a system. The probability density function of t, the actual survival time, is f(t). The survival function S(t) is the probability that the failure occurs later than time t. The related hazard function h(t) denotes the probability density of an event occurring around time t, given that it has not occurred prior to time t.

As we know, an inherent characteristic of survival data is censoring. Right-censored data, which is the commonest form of censoring, occur when survival times are greater than some defined time point [1, 2]. The generated data used in this study contain right-censored observations.

The proportional hazards model, which is also called Cox regression, is a popular method in the analysis of survival data. This model is presented as

    h(t | x) = h_0(t) exp(β′x).                                            (1)

In this model, β is the vector of regression coefficients, x is a vector of covariates, and h_0(t) is the baseline hazard, which is unspecified and a function of time. The likelihood and partial likelihood functions for right-censored survival data [18] are given by (2) and (3):

    L = ∏_{i=1}^{N} f(t_i, x_i)^{δ_i} S(t_i, x_i)^{1−δ_i}
      = ∏_{i=1}^{N} [f(t_i, x_i) / S(t_i, x_i)]^{δ_i} S(t_i, x_i)
      = ∏_{i=1}^{N} h(t_i, x_i)^{δ_i} S(t_i, x_i),                         (2)

    L = ∏_{i=1}^{D} exp(∑_{k=1}^{p} β_k X_{(i)k}) / ∑_{j∈R(t_i)} exp(∑_{k=1}^{p} β_k X_{(j)k}),   (3)

where δ_i is the censoring status (δ_i = 1 if the observation is complete and δ_i = 0 if it is censored) and R(t_i) is defined as the risk set at time t_i.

To fit the Cox regression and estimate β, the partial likelihood in (3) is maximized using iteratively reweighted least squares to implement the Newton-Raphson method. However, in the high-dimensional case this approach cannot be used to estimate β; indeed, β is not even unique.
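To make the estimation step concrete, the following is a minimal pure-Python sketch of the partial log-likelihood in (3), the quantity that Newton-Raphson maximizes. The function name and the toy data are illustrative, not from the paper:

```python
import math

def cox_partial_loglik(beta, times, delta, X):
    """Log partial likelihood (3): each observed event contributes its
    linear predictor minus the log of the risk-set sum of exp(eta_j)."""
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    ll = 0.0
    for i in range(len(times)):
        if delta[i] == 1:  # only complete (uncensored) observations contribute
            # risk set R(t_i): subjects still under observation at time t_i
            risk_sum = sum(math.exp(eta[j]) for j in range(len(times))
                           if times[j] >= times[i])
            ll += eta[i] - math.log(risk_sum)
    return ll

# toy data: 4 subjects, one binary covariate, one censored observation
ll = cox_partial_loglik([0.5], [2.0, 3.0, 5.0, 7.0], [1, 0, 1, 1],
                        [[1], [0], [1], [0]])
```

In practice the maximization over β would iterate Newton-Raphson steps on this function's gradient and Hessian, as the text describes.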

2.2. Neural Networks Model. An ANN consists of several layers, which are interconnected groups of artificial neurons. In addition, each layer has weights that indicate the amount of the effect of neurons on each other. Usually an ANN model has three layers, called the input, hidden (middle), and output layers. The input layer contains the predictors. The hidden layer contains unobservable nodes and applies a nonlinear transformation to a linear combination of the input layer; the value of each hidden node is a function of the predictors. The output layer contains the outcome, which is some function of the hidden units. In the hidden and output layers, the exact form of the function depends on the network type and user definition (based on the response variable).

There are different methods for learning in a NN. For example, in the multilayer perceptron (MLP), which is the most commonly used, learning is performed by minimizing the mean square error of the output via the back-propagation learning algorithm [16, 19, 20].

In this paper we use a sigmoid activation transfer function in both the hidden layer (g_h) and the output layer (g_o). The i-th response y_i for the predictor values H_ih is a nonlinear function:

    y_i = g_o(H_i′ β) + ε_i,
    H_ih = g_h(X_i′ α_h).                                                  (4)

Note that X_i′ is the i-th row of the input data matrix X, H_ih is a nonlinear function of a linear combination of the input data, β is the vector of weights from the hidden to the output units, and α is the matrix of weights from the input to the hidden units. The equations in (4) together yield the MLP model

    y_i = g_o(H_i′ β) + ε_i = g_o[β_0 + ∑_{h=1}^{H−1} β_h g_h(X_i′ α_h)] + ε_i.   (5)

With the sigmoid activation function, (5) can be written as below, which is a nonlinear regression:

    y_i = [1 + exp(−H_i′ β)]^{−1} + ε_i
        = [1 + exp(−β_0 − ∑_{h=1}^{H−1} β_h [1 + exp(−X_i′ α_h)]^{−1})]^{−1} + ε_i
        = g(X_i, β, α_1, α_2, …, α_H) + ε_i,                               (6)

where β, α_1, α_2, …, α_H are unknown parameter vectors, X_i is a vector of known constants, and the ε_i are residuals. The parameters (weights) can be estimated by optimizing some criterion function, such as maximizing the log-likelihood function or minimizing the sum of squared errors.
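A minimal sketch of the forward pass in (6), with a sigmoid in both layers; the weights below are arbitrary illustrative values, not fitted ones:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_predict(x, alpha, beta0, beta):
    """Evaluate (6): y = sigmoid(beta_0 + sum_h beta_h * sigmoid(x' alpha_h))."""
    hidden = [sigmoid(sum(a * xi for a, xi in zip(alpha_h, x)))
              for alpha_h in alpha]
    return sigmoid(beta0 + sum(b * h for b, h in zip(beta, hidden)))

# toy network: 2 inputs, 2 hidden nodes
alpha = [[0.4, -0.2], [0.1, 0.3]]  # input-to-hidden weights, one row per hidden node
beta0, beta = -0.5, [1.0, -1.0]    # output bias and hidden-to-output weights
y = mlp_predict([1.0, 2.0], alpha, beta0, beta)  # a value in (0, 1)
```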

In an MLP framework, a serious problem is overfitting. To control overfitting, a penalty term is usually added to the optimization criterion. The resulting penalized least squares criterion for parameter estimation is given by [1]

    E* = ∑_{i=1}^{n} (y_i − ŷ_i)² + p_λ(β, α_1, α_2, …, α_H)
       = ∑_{i=1}^{n} (y_i − g(X_i, β, α_1, α_2, …, α_H))² + p_λ(β, α_1, α_2, …, α_H),   (7)


where the penalty term is p_λ(β, α_1, α_2, …, α_H) = λ(∑ β_i² + ∑ α_ij²).

In the likelihood schema, which is often used in shrinkage methods, an adaptation of (7) is [8, 21]

    L = −log-likelihood + λ(∑ β_i² + ∑ α_ij²).                             (8)

The penalty weight λ regulates between over- and underfitting. A good value of λ is between 0.001 and 0.1 and is chosen by cross-validation [1, 8]. In this paper we use (8) to obtain the parameter estimates. Note that for an outcome (the response variable) with two classes, δ = (0, 1), p_i is the probability of the event for the i-th patient, and the error function becomes the cross-entropy error function

    E = −∑ [δ_i log(p_i) + (1 − δ_i) log(1 − p_i)].                        (9)

An ANN can be modeled as a generalized linear model with nonlinear predictors [8-11]. Biganzoli et al. [8] introduced a method called the partial logistic ANN, and Lisboa [22] developed it to fit smooth estimates of the discrete-time hazard within this structure. It is similar to an MLP [23] with an additional covariate, namely time, as an input, and is given by

    h_p(x_p, t_k) / [1 − h_p(x_p, t_k)]
        = exp(∑_{h=1}^{N_h} w_h g(∑_{i=1}^{N_i} w_ih x_pi + w_k t_k + b_h) + b),   (10)

where N_i and N_h denote the numbers of input and hidden nodes, respectively, and b_h and b denote the bias terms in the hidden and output layers, respectively. After estimation of the network weights w, a single output node estimates conditional failure probability values from the connections with the middle units, and the survivorship is calculated from the estimated discrete-time hazard by multiplying the conditionals for survival over the time intervals. Then the −log(likelihood) statistic can be obtained as [8]

    E = −∑_{p=1}^{no. of patients} ∑_{k=1}^{t_1} [δ_pk log(h_p(x_p, t_k))
        + (1 − δ_pk) log(1 − h_p(x_p, t_k))],                              (11)

where δ is the censoring indicator function.
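The survivorship computation described above (multiplying the conditional survival probabilities over the time intervals) can be sketched as follows; the function name and hazard values are illustrative:

```python
def survival_from_discrete_hazard(hazards):
    """S(t_k) = prod_{j<=k} (1 - h(t_j)): to survive past interval k,
    the subject must survive every interval up to and including k."""
    surv, s = [], 1.0
    for h in hazards:
        s *= (1.0 - h)
        surv.append(s)
    return surv

# estimated discrete hazards for three consecutive intervals
S = survival_from_discrete_hazard([0.1, 0.2, 0.25])
```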

2.3. Model Fitting. The ultimate goal of the learning process is to minimize the error of the net. In the training step, to fit the model with a fixed number of hidden nodes, we use the penalized likelihood

    E* = E + λ(∑ β_i² + ∑ α_ij²).                                          (12)

By using this, we improve the convergence of the optimization and also control the overfitting problem [1, 8, 9, 16, 21].

To identify the number of hidden nodes and then perform model selection, the Bayesian Information Criterion (BIC) and the Network Information Criterion (NIC) [8, 23, 24], which is a generalization of the Akaike Information Criterion (AIC), are calculated:

    BIC = −2 × log-likelihood + log(N) × P,
    NIC = 2E* + 2P,                                                        (13)

where P is the number of estimated parameters and N is the number of observations in the training set. The best model is the one with the smallest value of these criteria. In addition, to assess prediction accuracy in the validation (testing) group, we calculated classification accuracy and mean square error (MSE).
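The criteria in (13) are simple to evaluate once the fitted log-likelihood (or penalized error E*) and the parameter count are known; a small sketch with illustrative numbers:

```python
import math

def bic(loglik, n, p):
    """BIC = -2 * log-likelihood + log(N) * P, as in (13)."""
    return -2.0 * loglik + math.log(n) * p

def nic(e_star, p):
    """NIC = 2 * E* + 2 * P, as in (13)."""
    return 2.0 * e_star + 2.0 * p

# comparing two candidate networks on a training set of N = 700:
# a slightly better fit does not justify six extra parameters here
b_small = bic(-350.0, 700, 9)
b_large = bic(-349.0, 700, 15)
smaller_is_better = b_small < b_large
```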

The best model is selected as the one with the smallest MSE. The models considered 2, 3, 4, 5, 10, 15, and 20 hidden nodes. The weight decay was set to 0.012, which was chosen based on an empirical study [25].

Finally, in order to compare the Cox and ANN predictions, classification accuracy and concordance indexes were calculated. All simulations and comparisons were performed with R 2.14.1.

3. Simulation

In order to compare the accuracy of the predictions by ANN and Cox regression, four different simulation schemes based on Monte Carlo simulation were used. In each scheme, the hazard at any time t was considered to have exponential form [26], namely λ (Table 1). For each scheme, 1000 independent random observations were generated, and then, based on the relationship between the exponential parameter and the independent variables, survival times were generated. Afterward, the survival times were transformed into right-censored ones: if a generated time t_i is greater than the quantile of the exponential function with parameter λ, it is considered censored. This process was repeated 100 times. To assess the accuracy of the predictions, each sample was randomly divided into two parts: the training group, consisting of 700 observations, and the testing group, consisting of the remaining 300 observations. Furthermore, in all simulations the average rates of censoring were set to 20%, 30%, 40%, 50%, 60%, 70%, and 80%. In addition, the models were considered with the main effects and without/with interaction terms, as simple and complex models, respectively.
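A sketch of one generation step under scheme (I) (λ = exp(X1 + 0.2 X2)); the function name, the seeding, and the per-subject quantile cutoff follow the description above, but the details are illustrative, not the paper's R code:

```python
import math
import random

def simulate_model_I(n, censor_q, seed=1):
    """Generate n (time, delta, x1, x2) tuples with lambda = exp(x1 + 0.2*x2),
    x1 ~ Ber(0.5), x2 ~ N(0,1).  A time beyond the censor_q quantile of
    Exp(lambda) is right-censored (delta = 0), as in the text."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = rng.randint(0, 1)                     # Ber(0.5)
        x2 = rng.gauss(0.0, 1.0)                   # N(0, 1)
        lam = math.exp(x1 + 0.2 * x2)
        t = rng.expovariate(lam)                   # exponential survival time
        cutoff = -math.log(1.0 - censor_q) / lam   # censor_q quantile of Exp(lam)
        delta = 1 if t <= cutoff else 0            # 0 = censored
        data.append((min(t, cutoff), delta, x1, x2))
    return data

sample = simulate_model_I(1000, 0.8)  # about 20% censoring on average
```

Censoring at the 80% quantile of each subject's own exponential distribution gives each observation a 20% chance of being censored, matching the lowest average censoring rate considered.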

In simulations 1 and 2, two covariates were used, generated randomly from binomial and standard normal distributions; the models of these simulations contained no and one interaction term, respectively. In simulation 3, three covariates were used, generated randomly from binomial and standard normal distributions, respectively; the model of this simulation had two interaction terms. In simulation 4, four covariates were used, generated randomly from binomial and standard normal distributions; the model of this simulation is complex and consists of two-, three-, and four-way interaction terms (Table 1).


Table 1: Description of models* and their censoring status.

Model (I):   λ = exp(X1 + 0.2 X2);  X1 ~ Ber(0.5), X2 ~ N(0,1)
Model (II):  λ = exp(X1 + 0.2 X2 + 0.2 X1X2);  X1 ~ Ber(0.5), X2 ~ N(0,1)
Model (III): λ = exp(X1 + X2 + 0.5 X3 + 0.5 X1X2 + 0.25 X1X3 + 0.25 X2X3);  X1 ~ Ber(0.25), X2 ~ Ber(0.5), X3 ~ N(0,1)
Model (IV):  λ = exp[2(X1 + X2) + 1(X3) + 0.5(X4) + 1.0(X1X2 + X1X3 + X1X4) + 0.5(X2X3 + X2X4 + X3X4) + 0.25(X1X2X3 + X1X2X4 + X1X3X4 + X2X3X4) + 0.5(X1X2X3X4)];  X1 ~ Ber(0.25), X2 ~ Ber(0.5), X3 ~ N(0,1), X4 ~ N(0,1)

Censoring status (all models): average censoring for the training and testing sets equal to 20%, 30%, 40%, 50%, 60%, 70%, and 80%.

* The hazard at any time t was considered as exponential in form, and then the survival time was generated.


Table 2: Results of ANN models for simulated data.

Model*      MLP (H=2)      MLP (H=3)      MLP (H=4)      MLP (H=5)      MLP (H=10)     MLP (H=15)     MLP (H=20)
            BIC**  MSE***  BIC    MSE     BIC    MSE     BIC    MSE     BIC    MSE     BIC    MSE     BIC    MSE
I    20%    0.213  1.055   0.203  1.148   0.205  1.252   0.214  1.302   0.223  1.405   0.254  1.387   0.278  2.235
     30%    0.213  1.145   0.212  1.149   0.210  1.241   0.211  1.249   0.219  1.411   0.239  1.422   0.269  2.222
     40%    0.212  1.134   0.217  1.163   0.209  1.321   0.209  1.392   0.228  1.433   0.249  1.473   0.283  2.314
     50%    0.227  1.243   0.209  1.187   0.234  1.449   0.203  1.218   0.231  1.429   0.251  1.521   0.279  2.543
     60%    0.256  1.289   0.217  1.251   0.239  1.443   0.211  1.325   0.241  1.511   0.262  1.611   0.281  2.468
     70%    0.281  1.259   0.226  1.263   0.259  1.398   0.232  1.281   0.224  1.534   0.273  1.597   0.280  2.413
     80%    0.278  1.337   0.232  1.258   0.266  1.527   0.223  1.216   0.249  1.617   0.261  2.018   0.266  2.731
II   20%    0.229  1.255   0.223  1.248   0.235  1.452   0.234  1.402   0.323  2.005   0.264  1.987   0.269  2.315
     30%    0.254  1.431   0.282  1.428   0.235  1.665   0.243  1.723   0.259  1.835   0.263  2.650   0.276  2.123
     40%    0.231  1.233   0.218  1.219   0.221  1.332   0.220  1.381   0.249  1.531   0.257  1.583   0.291  2.411
     50%    0.237  1.342   0.222  1.317   0.241  1.539   0.213  1.318   0.251  1.689   0.266  1.591   0.280  2.561
     60%    0.256  1.381   0.245  1.359   0.248  1.471   0.232  1.425   0.255  1.533   0.282  1.635   0.279  2.471
     70%    0.281  1.259   0.236  1.263   0.263  1.411   0.223  1.279   0.263  1.576   0.291  1.608   0.291  2.409
     80%    0.278  1.337   0.255  1.358   0.287  1.597   0.269  1.221   0.247  1.523   0.287  2.107   0.273  2.729
III  20%    0.283  1.261   0.235  1.274   0.250  1.408   0.241  1.299   0.240  1.539   0.277  1.607   0.291  2.433
     30%    0.269  1.344   0.241  1.262   0.260  1.532   0.249  1.284   0.252  1.623   0.258  2.208   0.269  2.751
     40%    0.367  1.204   0.259  1.888   0.202  1.883   0.202  2.329   0.247  2.428   0.222  2.222   0.264  1.895
     50%    0.241  1.358   0.232  1.321   0.252  1.541   0.220  1.327   0.263  1.699   0.277  1.603   0.288  2.593
     60%    0.261  1.421   0.255  1.362   0.259  1.485   0.244  1.432   0.274  1.543   0.295  1.688   0.280  2.528
     70%    0.289  1.371   0.267  1.274   0.271  1.422   0.253  1.269   0.292  1.552   0.286  1.662   0.303  2.687
     80%    0.286  1.345   0.273  1.366   0.281  1.622   0.265  1.482   0.295  1.538   0.302  2.134   0.333  2.869
IV   20%    0.243  1.388   0.269  1.422   0.239  1.365   0.241  1.350   0.270  1.708   0.288  1.694   0.293  2.585
     30%    0.267  1.435   0.259  1.471   0.230  1.488   0.237  1.499   0.280  1.638   0.285  1.679   0.311  2.569
     40%    0.290  1.387   0.273  1.365   0.282  1.433   0.268  1.328   0.288  1.595   0.296  1.675   0.298  2.701
     50%    0.292  1.667   0.281  1.621   0.283  1.560   0.271  1.634   0.279  1.807   0.288  1.632   0.302  2.325
     60%    0.298  1.832   0.204  1.834   0.199  2.386   0.161  2.912   0.158  2.126   0.159  2.325   0.173  2.667
     70%    0.238  2.181   0.239  2.174   0.247  2.392   0.237  2.403   0.253  2.359   0.233  2.249   0.256  2.411
     80%    0.300  2.087   0.301  1.902   0.321  2.099   0.292  2.077   0.295  2.364   0.289  2.630   0.322  2.716

* Four models with different rates of censoring.
** Average of BIC for the learning set (700 cases with 100 replications).
*** Average of MSE for the testing set (300 cases with 100 replications).


Table 3: Results of concordance indexes of the simulation study in the testing subset (300 cases with 100 replications).

Model*      ANN              Cox Reg.
I    20%    0.821 ± 0.028    0.820 ± 0.031
     30%    0.819 ± 0.027    0.820 ± 0.030
     40%    0.824 ± 0.028    0.823 ± 0.030
     50%    0.823 ± 0.027    0.823 ± 0.029
     60%    0.823 ± 0.027    0.822 ± 0.030
     70%    0.825 ± 0.027    0.822 ± 0.032
     80%    0.826 ± 0.030    0.823 ± 0.031
II   20%    0.818 ± 0.029    0.817 ± 0.031
     30%    0.817 ± 0.034    0.815 ± 0.046
     40%    0.818 ± 0.032    0.818 ± 0.031
     50%    0.816 ± 0.035    0.814 ± 0.032
     60%    0.816 ± 0.031    0.813 ± 0.032
     70%    0.814 ± 0.033    0.811 ± 0.031
     80%    0.812 ± 0.034    0.808 ± 0.036
III  20%    0.808 ± 0.030    0.798 ± 0.035
     30%    0.799 ± 0.033    0.790 ± 0.033
     40%    0.812 ± 0.043    0.795 ± 0.038
     50%    0.805 ± 0.028    0.795 ± 0.030
     60%    0.804 ± 0.028    0.790 ± 0.033
     70%    0.805 ± 0.028    0.791 ± 0.034
     80%    0.803 ± 0.028    0.790 ± 0.032
IV   20%    0.764 ± 0.033    0.759 ± 0.036
     30%    0.773 ± 0.029    0.762 ± 0.033
     40%    0.777 ± 0.030    0.759 ± 0.035
     50%    0.779 ± 0.035    0.764 ± 0.034
     60%    0.812 ± 0.039    0.790 ± 0.036
     70%    0.764 ± 0.036    0.744 ± 0.035
     80%    0.766 ± 0.035    0.741 ± 0.037

* Four models with different rates of censoring.

Model selection is based on BIC for the learning set and on the MSE criterion for the testing subset as a verification. The results in Table 2 show that the simple models perform well with fewer hidden nodes, but the complex models perform better with more hidden nodes. The MSE values confirm these results (Table 2).

In the next step, to compare the ANN and Cox regression predictions, concordance indexes were calculated from the classification accuracy table in the testing subset. The concordance index has been described as a generalization of the area under the receiver operating characteristic curve for censored data [27, 28]. This index is the proportion of cases that are classified correctly in the noncensored (event) and censored groups; values between 0 and 1 indicate the accuracy of the models. The concordance indexes of the ANN and Cox regression models are reported in Table 3. The results of the simulation study showed that in the simpler models there was no difference between the predictions of the Cox regression and NN models, but NN predictions were better than Cox regression predictions in the complex models with high rates of censoring.
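For reference, a minimal sketch of Harrell's concordance index for right-censored data, following the usual pair-counting convention (a pair is usable only when the earlier time is an observed event); the toy data are illustrative:

```python
def concordance_index(times, delta, risk_scores):
    """Fraction of usable pairs in which the subject with the higher
    risk score also has the shorter survival time; score ties count 1/2."""
    num, den = 0.0, 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # pair (i, j) is usable if i's event is observed before time t_j
            if delta[i] == 1 and times[i] < times[j]:
                den += 1
                if risk_scores[i] > risk_scores[j]:
                    num += 1
                elif risk_scores[i] == risk_scores[j]:
                    num += 0.5
    return num / den

c = concordance_index([2, 3, 5, 7], [1, 0, 1, 1], [0.9, 0.4, 0.1, 0.7])
```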

4. Conclusion

In this paper we presented two approaches for modeling survival data with different degrees of censoring: Cox regression and neural network models. A Monte Carlo simulation study was performed to compare the predictive accuracy of the Cox and neural network models on simulated data sets.

In the simulation study, four different models were considered. The rate of censoring in each of these models was varied from 20% up to 80%. These models were considered with the main effects and also with interaction terms. Then the ability of these models in prediction was evaluated. As was seen, in simple models and with fewer censored cases there was little difference between the predictions of the ANN and Cox regression models. It seems that for simpler models the level of censoring has no effect on the predictions, but the predictions in more complex models depend on the level of censoring. The results showed that the NN model provided better predictions for more complex models, while for simpler models there was no difference in the results. This finding is consistent with Xiang's study [26]. Therefore, the NN model is proposed in two cases: (1) occurrence of high censoring (i.e., a censoring rate of 60% and higher) and/or (2) complex models (i.e., with many covariates and interaction terms). This is a very useful result for practical problems, which often involve many variables and also many censored cases. For these reasons, in these two cases the ANN strategy can be used as an alternative to the traditional Cox model. Finally, it is worth mentioning that there are some flexible alternative methods, such as piecewise exponential and grouped time models, which could be applied to survival data and their ability compared with the ANN model.

Acknowledgment

The authors wish to express their special thanks to the referees for their valuable comments.

References

[1] E. T. Lee and J. W. Wang, Statistical Methods for Survival Data Analysis, Wiley Series in Probability and Statistics, Wiley-Interscience, Hoboken, NJ, USA, 3rd edition, 2003.

[2] V. Lagani and I. Tsamardinos, "Structure-based variable selection for survival data," Bioinformatics, vol. 26, no. 15, pp. 1887-1894, 2010.

[3] M. H. Kutner, C. J. Nachtsheim, and J. Neter, Applied Linear Regression Models, McGraw-Hill/Irwin, New York, NY, USA, 4th edition, 2004.

[4] W. G. Baxt and J. Skora, "Prospective validation of artificial neural networks trained to identify acute myocardial infarction," Lancet, vol. 347, pp. 12-15, 1996.


[5] B. A. Mobley, E. Schecheer, and W. E. Moore, "Prediction of coronary artery stenosis by artificial networks," Artificial Intelligence in Medicine, vol. 18, pp. 187-203, 2000.

[6] D. West and V. West, "Model selection for medical diagnostic decision support system: breast cancer detection case," Artificial Intelligence in Medicine, vol. 20, pp. 183-204, 2000.

[7] F. Ambrogi, N. Lama, P. Boracchi, and E. Biganzoli, "Selection of artificial neural network models for survival analysis with genetic algorithms," Computational Statistics & Data Analysis, vol. 52, no. 1, pp. 30-42, 2007.

[8] E. Biganzoli, P. Boracchi, L. Mariani, and E. Marubini, "Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach," Statistics in Medicine, vol. 17, pp. 1169-1186, 1998.

[9] E. Biganzoli, P. Boracchi, and E. Marubini, "A general framework for neural network models on censored survival data," Neural Networks, vol. 15, pp. 209-218, 2002.

[10] E. M. Bignazoli and P. Borrachi, "The Partial Logistic Artificial Neural Network (PLANN): a tool for the flexible modelling of censored survival data," in Proceedings of the European Conference on Emergent Aspects in Clinical Data Analysis (EACDA '05), 2005.

[11] E. M. Bignazoli, P. Borrachi, F. Amborgini, and E. Marubini, "Artificial neural network for the joint modeling of discrete cause-specific hazards," Artificial Intelligence in Medicine, vol. 37, pp. 119-130, 2006.

[12] R. Bittern, A. Cuschieri, S. D. Dolgobrodov et al., "An artificial neural network for analysing the survival of patients with colorectal cancer," in Proceedings of the European Symposium on Artificial Neural Networks (ESANN '05), Bruges, Belgium, April 2005.

[13] K. U. Chen and C. J. Christian, "Using back-propagation neural network to forecast the production values of the machinery industry in Taiwan," Journal of American Academy of Business, Cambridge, vol. 9, no. 1, pp. 183-190, 2006.

[14] C. L. Chia, W. Nick Street, and H. W. William, "Application of artificial neural network-based survival analysis on two breast cancer datasets," in Proceedings of the American Medical Informatics Association Annual Symposium (AMIA '07), pp. 130-134, Chicago, Ill, USA, November 2007.

[15] A. Eleuteri, R. Tagliaferri, and L. Milano, "A novel neural network-based survival analysis model," Neural Networks, vol. 16, pp. 855-864, 2003.

[16] B. D. Ripley and R. M. Ripley, "Neural networks as statistical methods in survival analysis," in Clinical Applications of Artificial Neural Networks, pp. 237-255, Cambridge University Press, Cambridge, UK, 2001.

[17] B. Warner and M. Manavendra, "Understanding neural networks as statistical tools," The American Statistician, vol. 50, no. 4, pp. 284-293, 1996.

[18] J. P. Klein and M. L. Moeschberger, Survival Analysis: Techniques for Censored and Truncated Data, Springer, New York, NY, USA, 2nd edition, 2003.

[19] J. W. Kay and D. M. Titterington, Statistics and Neural Networks, Oxford University Press, Oxford, UK, 1999.

[20] E. P. Goss and G. S. Vozikis, "Improving health care organizational management through neural network learning," Health Care Management Science, vol. 5, pp. 221-227, 2002.

[21] R. M. Ripley, A. L. Harris, and L. Tarassenko, "Non-linear survival analysis using neural networks," Statistics in Medicine, vol. 23, pp. 825-842, 2004.

[22] P. J. G. Lisboa, H. Wong, P. Harris, and R. Swindell, "A Bayesian neural network approach for modelling censored data with an application to prognosis after surgery for breast cancer," Artificial Intelligence in Medicine, vol. 28, no. 1, pp. 1-25, 2003.

[23] C. M. Bishop, Neural Networks for Pattern Recognition, The Clarendon Press, Oxford University Press, New York, NY, USA, 1995.

[24] B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, UK, 1996.

[25] N. Fallah, G. U. Hong, K. Mohammad et al., "Nonlinear Poisson regression using neural networks: a simulation study," Neural Computing and Applications, vol. 18, no. 8, pp. 939-943, 2009.

[26] A. Xiang, P. Lapuerta, A. Ryutov et al., "Comparison of the performance of neural network methods and Cox regression for censored survival data," Computational Statistics & Data Analysis, vol. 34, no. 2, pp. 243-257, 2000.

[27] F. E. Harrell, R. M. Califf, D. B. Pryor et al., "Evaluating the yield of medical tests," The Journal of the American Medical Association, vol. 247, pp. 2543-2546, 1982.

[28] F. E. Harrell, K. L. Lee, R. M. Califf et al., "Regression modeling strategies for improved prognostic prediction," Statistics in Medicine, vol. 3, pp. 143-152, 1984.



traditional Cox regression models, and then the results of the predictions were compared.

2. Methods

2.1. Cox Regression Model. Suppose that T denotes a continuous nonnegative random variable describing the failure time of an event (i.e., time-to-event) in a system. The probability density function of t, the actual survival time, is f(t). The survival function S(t) is the probability that the failure occurs later than time t. The related hazard function h(t) denotes the probability density of an event occurring around time t, given that it has not occurred prior to time t.

As we know, an inherent characteristic of survival data is censoring. Right-censored data, the most common form of censoring, occur when survival times are greater than some defined time point [1, 2]. The generated data used in this study contain right-censored observations.

The proportional hazards model, also called Cox regression, is a popular method in the analysis of survival data. This model is presented as

h(t | x) = h_0(t) exp(β′x).   (1)

In this model, β is the vector of regression coefficients, x is a vector of covariates, and h_0(t) is the baseline hazard, which is unspecified and a function of time. The likelihood and partial likelihood functions for right-censored survival data [18] are given by (2) and (3):

L = ∏_{i=1}^{N} f(t_i, x_i)^{δ_i} S(t_i, x_i)^{1−δ_i}
  = ∏_{i=1}^{N} [f(t_i, x_i) / S(t_i, x_i)]^{δ_i} S(t_i, x_i)
  = ∏_{i=1}^{N} h(t_i, x_i)^{δ_i} S(t_i, x_i),   (2)

L = ∏_{i=1}^{D} exp(∑_{k=1}^{p} β_k X_(i)k) / ∑_{j∈R(t_i)} exp(∑_{k=1}^{p} β_k X_(j)k).   (3)

Here δ_i is the censoring status: δ_i = 1 if the observation is complete and δ_i = 0 if it is censored; R(t_i) is defined as the risk set at time t_i.

To fit the Cox regression model and estimate β, the partial likelihood in (3) is maximized using iteratively reweighted least squares to implement the Newton-Raphson method. However, in the high-dimensional case this approach cannot be used to estimate β; indeed, β is not even unique.
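For intuition, the Newton-Raphson step on the partial log-likelihood can be sketched for the one-covariate case. This is a hypothetical toy implementation in Python rather than the authors' R code; it assumes no tied event times, and the function name is ours:

```python
import math

def cox_newton_raphson(times, events, x, iters=20):
    """Newton-Raphson maximization of the Cox partial log-likelihood
    for a single covariate x (no tied event times assumed)."""
    n = len(times)
    beta = 0.0
    for _ in range(iters):
        grad, hess = 0.0, 0.0
        for i in range(n):
            if not events[i]:
                continue                          # censored: no factor in (3)
            # Risk set: subjects still under observation at time t_i.
            risk = [j for j in range(n) if times[j] >= times[i]]
            w = [math.exp(beta * x[j]) for j in risk]
            sw = sum(w)
            m1 = sum(wj * x[j] for wj, j in zip(w, risk)) / sw       # weighted mean
            m2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk)) / sw  # weighted 2nd moment
            grad += x[i] - m1                     # score contribution
            hess += m2 - m1 ** 2                  # information contribution
        if hess == 0.0:
            break
        beta += grad / hess                       # Newton step
    return beta
```

When the event times of the two covariate groups interleave, the estimate settles near the partial-likelihood maximum within a handful of iterations.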

2.2. Neural Networks Model. An ANN consists of several layers, which are interconnected groups of artificial neurons. In addition, each connection has a weight that indicates the amount of the effect of one neuron on another. Usually an ANN model has three layers, called the input, hidden (middle), and output layers. The input layer contains the predictors. The hidden layer contains unobservable nodes and applies a nonlinear transformation to a linear combination of the input layer; the value of each hidden node is a function of the predictors. The output layer contains the outcome, which is some function of the hidden units. In the hidden and output layers, the exact form of the function depends on the network type and user definition (based on the response variable).

There are different methods for learning in an NN. For example, in the multilayer perceptron (MLP), which is the most commonly used, learning is performed by minimizing the mean square error of the output via the back-propagation learning algorithm [16, 19, 20].

In this paper we use a sigmoid activation transfer function g_h in the hidden layer and g_o in the output layer. The ith response y_i for the hidden-unit values H_ih is a nonlinear function:

y_i = g_o(H_i′β) + ε_i,  H_ih = g_h(X_i′α_h).   (4)

Note that X_i′ is the ith row of the input data matrix X, H_ih is a nonlinear function of a linear combination of the input data, β is the vector of weights from the hidden to the output units, and α is the matrix of weights from the input to the hidden units. The expressions in (4) together yield the MLP model

y_i = g_o(H_i′β) + ε_i = g_o[β_0 + ∑_{h=1}^{H−1} β_h g_h(X_i′α_h)] + ε_i.   (5)

With the sigmoid activation function, (5) can be written as follows, which is a nonlinear regression:

y_i = [1 + exp(−H_i′β)]^{−1} + ε_i
    = [1 + exp(−β_0 − ∑_{h=1}^{H−1} β_h [1 + exp(−X_i′α_h)]^{−1})]^{−1} + ε_i
    = g(X_i, β, α_1, α_2, …, α_H) + ε_i,   (6)

where β, α_1, α_2, …, α_H are unknown parameter vectors, X_i is a vector of known constants, and the ε_i are residuals. The parameters (weights) can be estimated by optimizing some criterion function, such as maximizing the log-likelihood function or minimizing the sum of squared errors.
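The forward pass in (6) is a running product of sigmoids; a minimal Python sketch (function names are ours, for illustration only):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_predict(x, alpha, beta0, beta):
    """Forward pass of (6): hidden values H_h = sigmoid(x'alpha_h),
    output y = sigmoid(beta0 + sum_h beta_h * H_h)."""
    hidden = [sigmoid(sum(xi * a for xi, a in zip(x, alpha_h)))
              for alpha_h in alpha]
    return sigmoid(beta0 + sum(b * h for b, h in zip(beta, hidden)))
```

With all weights zero, each hidden unit outputs 0.5 and the prediction is sigmoid(β_0), which is a convenient sanity check.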

In an MLP framework, a serious problem is overfitting. To control overfitting, a penalty term is usually added to the optimization criterion. The penalized least squares criterion for parameter estimation is then given by [1]

E* = ∑_{i=1}^{n} (y_i − ŷ_i)² + p_λ(β, α_1, α_2, …, α_H)
   = ∑_{i=1}^{n} (y_i − g(X_i, β, α_1, α_2, …, α_H))² + p_λ(β, α_1, α_2, …, α_H),   (7)


where the penalty term is p_λ(β, α_1, α_2, …, α_H) = λ(∑β_i² + ∑α_ij²). In the likelihood schema, which is often used in shrinkage methods, an adaptation of (7) is [8, 21]

L = −log likelihood + λ(∑β_i² + ∑α_ij²).   (8)

The penalty weight λ regulates between over- and underfitting. A good value of λ lies between 0.001 and 0.1 and is chosen by cross-validation [1, 8]. In this paper we use (8) to obtain the parameter estimates. Note that for an outcome (response variable) with two classes, δ = (0, 1), p_i is the probability of the event for the ith patient, and the error function is the cross-entropy error function

E = −∑[δ_i log(p_i) + (1 − δ_i) log(1 − p_i)].   (9)
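The cross-entropy error (9) and its weight-decay-penalized form in the spirit of (8) can be sketched in a few lines of Python (function names are ours, for illustration):

```python
import math

def cross_entropy(delta, p):
    """Cross-entropy error (9) for binary outcomes delta and
    predicted event probabilities p."""
    return -sum(d * math.log(q) + (1 - d) * math.log(1 - q)
                for d, q in zip(delta, p))

def penalized_error(delta, p, betas, alphas, lam):
    """Error plus the weight-decay penalty lambda*(sum beta^2 + sum alpha^2),
    as in (8)."""
    penalty = lam * (sum(b * b for b in betas) +
                     sum(a * a for row in alphas for a in row))
    return cross_entropy(delta, p) + penalty
```

Setting lam = 0 recovers the unpenalized error; larger lam shrinks the weights toward zero during training.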

An ANN can be modeled as a generalized linear model with nonlinear predictors [8–11]. Biganzoli et al. [8] introduced a method called the partial logistic ANN (PLANN), and Lisboa [22] developed it to fit smooth estimates of the discrete-time hazard. It is similar to an MLP [23], with time as an additional input covariate, and is given by

h_p(x_p, t_k) / [1 − h_p(x_p, t_k)] = exp(∑_{h=1}^{N_h} w_h g(∑_{i=1}^{N_i} w_ih x_pi + w_k t_k + b_h) + b),   (10)

where N_i and N_h denote the number of input and hidden nodes, respectively, and b_h and b denote the bias terms in the hidden and output layers, respectively. After estimation of the network weights w, a single output node estimates the conditional failure probabilities from the connections with the middle (hidden) units, and the survivorship is calculated from the estimated discrete-time hazard by multiplying the conditional survival probabilities over the time intervals. The −log(likelihood) statistic can then be obtained as [8]

E = −∑_{p=1}^{P} ∑_{k=1}^{t_p} [δ_pk log(h_p(x_p, t_k)) + (1 − δ_pk) log(1 − h_p(x_p, t_k))],   (11)

where P is the number of patients, t_p is the last time interval observed for patient p, and δ is the censoring indicator function.
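The survivorship computation described above, multiplying the conditional survival probabilities over the time intervals, amounts to a running product over the estimated discrete-time hazards; a minimal sketch:

```python
def survival_from_hazards(hazards):
    """Discrete-time survivorship S(t_k) = prod_{j<=k} (1 - h(t_j)),
    turning PLANN-style conditional failure probabilities into a
    survival curve."""
    surv, s = [], 1.0
    for h in hazards:
        s *= (1.0 - h)       # survive this interval given survival so far
        surv.append(s)
    return surv
```

For example, interval hazards of 0.1, 0.2, and 0.5 give survival probabilities of 0.9, 0.72, and 0.36 at the three interval ends.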

2.3. Model Fitting. The ultimate goal of the learning process is to minimize the error of the network. In the training step, to fit the model with a fixed number of hidden nodes, we use the penalized likelihood

E* = E + λ(∑β_i² + ∑α_ij²).   (12)

By using this, we improve the convergence of the optimization and also control the overfitting problem [1, 8, 9, 16, 21].

To identify the number of hidden nodes and then to select a model, the Bayesian Information Criterion (BIC) and the Network Information Criterion (NIC) [8, 23, 24], the latter a generalization of the Akaike Information Criterion (AIC), are calculated:

BIC = −2 × log likelihood + log(N) × P,
NIC = 2E* + 2P,   (13)

where P is the number of estimated parameters and N is the number of observations in the training set. The best model has the smallest value of these criteria. In addition, to assess prediction accuracy in the validation (testing) group, we calculated classification accuracy and mean square error (MSE).
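The criteria in (13) are direct to compute once the log-likelihood, the penalized error E*, and the parameter count are available; a small sketch (function names are ours):

```python
import math

def bic(loglik, n, p):
    """BIC = -2 * log-likelihood + log(N) * P, as in (13)."""
    return -2.0 * loglik + math.log(n) * p

def nic(e_star, p):
    """NIC = 2 * E* + 2 * P, as in (13)."""
    return 2.0 * e_star + 2.0 * p
```

Both criteria trade goodness of fit against model size, so the network with the smallest value is preferred.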

The best model is selected as the one with the smallest MSE. The models considered 2, 3, 4, 5, 10, 15, and 20 hidden nodes. The weight decay was set to 0.012, chosen based on an empirical study [25].

Finally, in order to compare the Cox and ANN predictions, classification accuracy and concordance indexes were calculated. All simulations and comparisons were performed with R 2.14.1.

3. Simulation

In order to compare the accuracy of the predictions by ANN and Cox regression, four different simulation schemes based on Monte Carlo simulation were used. In each scheme, the hazard at any time t was considered of exponential form [26], namely λ (Table 1). For each scheme, 1000 independent random observations were generated, and then, based on the relationship between the exponential parameter and the independent variables, survival times were generated. Afterward, the survival times were transformed to right-censored times: if a generated time t_i is greater than the quantile of the exponential function with parameter λ, it is considered censored. This process was repeated 100 times. To assess the accuracy of predictions, each sample was randomly divided into two parts: the first part, the training group, consisted of 700 observations, and the 300 remaining observations were allocated to the second group, that is, the testing group. Furthermore, in all simulations the average rates of censoring were considered equal to 20%, 30%, 40%, 50%, 60%, 70%, and 80%. In addition, the models were considered with the main effects and without/with interaction terms, as simple and complex models, respectively.

In simulations 1 and 2, two covariates were used, generated randomly from binomial and standard normal distributions; the models of these simulations contained no interaction term and one interaction term, respectively. In simulation 3, three covariates were used, generated randomly from binomial and standard normal distributions, respectively; the model of this simulation had two interaction terms. In simulation 4, four covariates were used, generated randomly from binomial and standard normal distributions; the model of this simulation is complex and consists of two-, three-, and four-way interaction terms (Table 1).
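The generation scheme for Model I can be sketched as follows. This is a hypothetical Python re-implementation rather than the authors' R code, and the per-subject censoring cutoff used to hit a target censoring rate is our assumption:

```python
import math
import random

def simulate_model_i(n, censor_quantile=0.7, seed=1):
    """Model I sketch: lambda = exp(X1 + 0.2*X2), X1 ~ Ber(0.5),
    X2 ~ N(0, 1); exponential survival times right-censored past a
    chosen quantile of Exp(lambda)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = 1 if rng.random() < 0.5 else 0
        x2 = rng.gauss(0.0, 1.0)
        lam = math.exp(x1 + 0.2 * x2)
        t = rng.expovariate(lam)                        # survival time
        cutoff = -math.log(1.0 - censor_quantile) / lam  # quantile of Exp(lam)
        if t > cutoff:
            data.append((cutoff, 0, x1, x2))             # censored
        else:
            data.append((t, 1, x1, x2))                  # event observed
    return data
```

With the cutoff at the 70% quantile, each subject is censored with probability 0.3, so the realized censoring rate fluctuates around 30%.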


Table 1: Description of models* and their censoring status.

Model (I): λ = exp(X1 + 0.2X2); X1 ~ Ber(0.5), X2 ~ N(0, 1).
    Average censoring for training and testing sets equal to 20%, 30%, 40%, 50%, 60%, 70%, and 80%.
Model (II): λ = exp(X1 + 0.2X2 + 0.2X1X2); X1 ~ Ber(0.5), X2 ~ N(0, 1).
    Average censoring for training and testing sets equal to 20%, 30%, 40%, 50%, 60%, 70%, and 80%.
Model (III): λ = exp(X1 + X2 + 0.5X3 + 0.5X1X2 + 0.25X1X3); X1 ~ Ber(0.25), X2 ~ Ber(0.5), X3 ~ N(0, 1).
    Average censoring for training and testing sets equal to 20%, 30%, 40%, 50%, 60%, 70%, and 80%.
Model (IV): λ = exp[2(X1 + X2) + 1(X3) + 0.5(X4) + 1.0(X1X2 + X1X3 + X1X4) + 0.5(X2X3 + X2X4 + X3X4) + 0.25(X1X2X3 + X1X2X4 + X1X3X4 + X2X3X4) + 0.5(X1X2X3X4)]; X1 ~ Ber(0.25), X2 ~ Ber(0.5), X3 ~ N(0, 1), X4 ~ N(0, 1).
    Average censoring for training and testing sets equal to 20%, 30%, 40%, 50%, 60%, 70%, and 80%.

* The hazard at any time t was considered of exponential form, and then the survival time was generated.


Table 2: Results of ANN models* for simulated data. Each cell gives BIC**/MSE*** for an MLP with H hidden nodes.

Model  %cens  H=2          H=3          H=4          H=5          H=10         H=15         H=20
I      20     0.213/1.055  0.203/1.148  0.205/1.252  0.214/1.302  0.223/1.405  0.254/1.387  0.278/2.235
       30     0.213/1.145  0.212/1.149  0.210/1.241  0.211/1.249  0.219/1.411  0.239/1.422  0.269/2.222
       40     0.212/1.134  0.217/1.163  0.209/1.321  0.209/1.392  0.228/1.433  0.249/1.473  0.283/2.314
       50     0.227/1.243  0.209/1.187  0.234/1.449  0.203/1.218  0.231/1.429  0.251/1.521  0.279/2.543
       60     0.256/1.289  0.217/1.251  0.239/1.443  0.211/1.325  0.241/1.511  0.262/1.611  0.281/2.468
       70     0.281/1.259  0.226/1.263  0.259/1.398  0.232/1.281  0.224/1.534  0.273/1.597  0.280/2.413
       80     0.278/1.337  0.232/1.258  0.266/1.527  0.223/1.216  0.249/1.617  0.261/2.018  0.266/2.731
II     20     0.229/1.255  0.223/1.248  0.235/1.452  0.234/1.402  0.323/2.005  0.264/1.987  0.269/2.315
       30     0.254/1.431  0.282/1.428  0.235/1.665  0.243/1.723  0.259/1.835  0.263/2.650  0.276/2.123
       40     0.231/1.233  0.218/1.219  0.221/1.332  0.220/1.381  0.249/1.531  0.257/1.583  0.291/2.411
       50     0.237/1.342  0.222/1.317  0.241/1.539  0.213/1.318  0.251/1.689  0.266/1.591  0.280/2.561
       60     0.256/1.381  0.245/1.359  0.248/1.471  0.232/1.425  0.255/1.533  0.282/1.635  0.279/2.471
       70     0.281/1.259  0.236/1.263  0.263/1.411  0.223/1.279  0.263/1.576  0.291/1.608  0.291/2.409
       80     0.278/1.337  0.255/1.358  0.287/1.597  0.269/1.221  0.247/1.523  0.287/2.107  0.273/2.729
III    20     0.283/1.261  0.235/1.274  0.250/1.408  0.241/1.299  0.240/1.539  0.277/1.607  0.291/2.433
       30     0.269/1.344  0.241/1.262  0.260/1.532  0.249/1.284  0.252/1.623  0.258/2.208  0.269/2.751
       40     0.367/1.204  0.259/1.888  0.202/1.883  0.202/2.329  0.247/2.428  0.222/2.222  0.264/1.895
       50     0.241/1.358  0.232/1.321  0.252/1.541  0.220/1.327  0.263/1.699  0.277/1.603  0.288/2.593
       60     0.261/1.421  0.255/1.362  0.259/1.485  0.244/1.432  0.274/1.543  0.295/1.688  0.280/2.528
       70     0.289/1.371  0.267/1.274  0.271/1.422  0.253/1.269  0.292/1.552  0.286/1.662  0.303/2.687
       80     0.286/1.345  0.273/1.366  0.281/1.622  0.265/1.482  0.295/1.538  0.302/2.134  0.333/2.869
IV     20     0.243/1.388  0.269/1.422  0.239/1.365  0.241/1.350  0.270/1.708  0.288/1.694  0.293/2.585
       30     0.267/1.435  0.259/1.471  0.230/1.488  0.237/1.499  0.280/1.638  0.285/1.679  0.311/2.569
       40     0.290/1.387  0.273/1.365  0.282/1.433  0.268/1.328  0.288/1.595  0.296/1.675  0.298/2.701
       50     0.292/1.667  0.281/1.621  0.283/1.560  0.271/1.634  0.279/1.807  0.288/1.632  0.302/2.325
       60     0.298/1.832  0.204/1.834  0.199/2.386  0.161/2.912  0.158/2.126  0.159/2.325  0.173/2.667
       70     0.238/2.181  0.239/2.174  0.247/2.392  0.237/2.403  0.253/2.359  0.233/2.249  0.256/2.411
       80     0.300/2.087  0.301/1.902  0.321/2.099  0.292/2.077  0.295/2.364  0.289/2.630  0.322/2.716

* Four models with different rates of censoring.
** Average BIC for the learning set (700 cases with 100 replications).
*** Average MSE for the testing set (300 cases with 100 replications).


Table 3: Results of concordance indexes of the simulation study in the testing subset (300 cases with 100 replications).

Model*  %cens  ANN            Cox Reg
I       20     0.821 ± 0.028  0.820 ± 0.031
        30     0.819 ± 0.027  0.820 ± 0.030
        40     0.824 ± 0.028  0.823 ± 0.030
        50     0.823 ± 0.027  0.823 ± 0.029
        60     0.823 ± 0.027  0.822 ± 0.030
        70     0.825 ± 0.027  0.822 ± 0.032
        80     0.826 ± 0.030  0.823 ± 0.031
II      20     0.818 ± 0.029  0.817 ± 0.031
        30     0.817 ± 0.034  0.815 ± 0.046
        40     0.818 ± 0.032  0.818 ± 0.031
        50     0.816 ± 0.035  0.814 ± 0.032
        60     0.816 ± 0.031  0.813 ± 0.032
        70     0.814 ± 0.033  0.811 ± 0.031
        80     0.812 ± 0.034  0.808 ± 0.036
III     20     0.808 ± 0.030  0.798 ± 0.035
        30     0.799 ± 0.033  0.790 ± 0.033
        40     0.812 ± 0.043  0.795 ± 0.038
        50     0.805 ± 0.028  0.795 ± 0.030
        60     0.804 ± 0.028  0.790 ± 0.033
        70     0.805 ± 0.028  0.791 ± 0.034
        80     0.803 ± 0.028  0.790 ± 0.032
IV      20     0.764 ± 0.033  0.759 ± 0.036
        30     0.773 ± 0.029  0.762 ± 0.033
        40     0.777 ± 0.030  0.759 ± 0.035
        50     0.779 ± 0.035  0.764 ± 0.034
        60     0.812 ± 0.039  0.790 ± 0.036
        70     0.764 ± 0.036  0.744 ± 0.035
        80     0.766 ± 0.035  0.741 ± 0.037

* Four models with different rates of censoring.

Model selection is based on the BIC for the learning set and on the MSE criterion for the testing subset, as verification. The results in Table 2 show that the simple models perform best with fewer hidden nodes, while the complex models perform better with more hidden nodes. The MSE values confirm these results (Table 2).

In the next step, to compare the ANN and Cox regression predictions, concordance indexes were calculated from the classification accuracy table in the testing subset. The concordance index has been proposed as a generalization of the area under the receiver operating characteristic curve for censored data [27, 28]. This index is the proportion of cases that are classified correctly in the noncensored (event) and censored groups; values between 0 and 1 indicate the accuracy of the models. The concordance indexes of the ANN and Cox regression models are reported in Table 3. The results of the simulation study for the simpler models showed no difference between the predictions of the Cox regression and NN models, but NN predictions were better than Cox regression predictions for the complex models with high rates of censoring.
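A straightforward O(n²) sketch of Harrell's concordance index for right-censored data is shown below (an illustrative implementation, not the authors' code; ties in time are skipped for simplicity):

```python
def concordance_index(times, events, scores):
    """Harrell's C: among usable pairs (the earlier time is an observed
    event), the fraction where the higher risk score goes with the
    shorter survival time; score ties count 1/2."""
    num, den = 0.0, 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair is usable only if subject i has the event and fails first.
            if events[i] and times[i] < times[j]:
                den += 1
                if scores[i] > scores[j]:
                    num += 1
                elif scores[i] == scores[j]:
                    num += 0.5
    return num / den
```

A value of 1 means every usable pair is ordered correctly by the risk scores, 0.5 is chance-level discrimination, and 0 means every pair is reversed.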

4. Conclusion

In this paper we presented two approaches for modeling survival data with different degrees of censoring: Cox regression and neural network models. A Monte Carlo simulation study was performed to compare the predictive accuracy of the Cox and neural network models on simulated data sets.

In the simulation study, four different models were considered. The rate of censoring in each of these models ranged from 20% up to 80%. These models were considered with the main effects and also with interaction terms, and the predictive ability of these models was then evaluated. As was seen, for simple models with fewer censored cases there was little difference between the ANN and Cox regression predictions. It seems that for simpler models the level of censoring has no effect on predictions, but the predictions in more complex models depend on the level of censoring. The results showed that the NN model provided better predictions for more complex models, while for simpler models there was no difference in the results. This finding is consistent with Xiang's study [26]. Therefore, the NN model is proposed in two cases: (1) high censoring (i.e., a censoring rate of 60% or higher) and/or (2) complex models (i.e., with many covariates and interaction terms). This is a useful result for practical problems, which often involve many variables and many censored cases; in these two situations the ANN strategy can be used as an alternative to the traditional Cox model. Finally, it should be mentioned that there are flexible alternative methods, such as piecewise exponential and grouped-time models, which can be applied to survival data and whose ability could then be compared with the ANN model.

Acknowledgment

The authors wish to express their special thanks to the referees for their valuable comments.

References

[1] E. T. Lee and J. W. Wang, Statistical Methods for Survival Data Analysis, Wiley Series in Probability and Statistics, Wiley-Interscience, Hoboken, NJ, USA, 3rd edition, 2003.

[2] V. Lagani and I. Tsamardinos, "Structure-based variable selection for survival data," Bioinformatics, vol. 26, no. 15, pp. 1887–1894, 2010.

[3] M. H. Kutner, C. J. Nachtsheim, and J. Neter, Applied Linear Regression Models, McGraw-Hill/Irwin, New York, NY, USA, 4th edition, 2004.

[4] W. G. Baxt and J. Skora, "Prospective validation of artificial neural networks trained to identify acute myocardial infarction," Lancet, vol. 347, pp. 12–15, 1996.

[5] B. A. Mobley, E. Schecheer, and W. E. Moore, "Prediction of coronary artery stenosis by artificial networks," Artificial Intelligence in Medicine, vol. 18, pp. 187–203, 2000.

[6] D. West and V. West, "Model selection for medical diagnostic decision support system: breast cancer detection case," Artificial Intelligence in Medicine, vol. 20, pp. 183–204, 2000.

[7] F. Ambrogi, N. Lama, P. Boracchi, and E. Biganzoli, "Selection of artificial neural network models for survival analysis with genetic algorithms," Computational Statistics & Data Analysis, vol. 52, no. 1, pp. 30–42, 2007.

[8] E. Biganzoli, P. Boracchi, L. Mariani, and E. Marubini, "Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach," Statistics in Medicine, vol. 17, pp. 1169–1186, 1998.

[9] E. Biganzoli, P. Boracchi, and E. Marubini, "A general framework for neural network models on censored survival data," Neural Networks, vol. 15, pp. 209–218, 2002.

[10] E. M. Biganzoli and P. Boracchi, "The Partial Logistic Artificial Neural Network (PLANN): a tool for the flexible modelling of censored survival data," in Proceedings of the European Conference on Emergent Aspects in Clinical Data Analysis (EACDA '05), 2005.

[11] E. M. Biganzoli, P. Boracchi, F. Ambrogi, and E. Marubini, "Artificial neural network for the joint modeling of discrete cause-specific hazards," Artificial Intelligence in Medicine, vol. 37, pp. 119–130, 2006.

[12] R. Bittern, A. Cuschieri, S. D. Dolgobrodov et al., "An artificial neural network for analysing the survival of patients with colorectal cancer," in Proceedings of the European Symposium on Artificial Neural Networks (ESANN '05), Bruges, Belgium, April 2005.

[13] K. U. Chen and C. J. Christian, "Using back-propagation neural network to forecast the production values of the machinery industry in Taiwan," Journal of American Academy of Business, Cambridge, vol. 9, no. 1, pp. 183–190, 2006.

[14] C. L. Chia, W. Nick Street, and H. W. William, "Application of artificial neural network-based survival analysis on two breast cancer datasets," in Proceedings of the American Medical Informatics Association Annual Symposium (AMIA '07), pp. 130–134, Chicago, Ill, USA, November 2007.

[15] A. Eleuteri, R. Tagliaferri, and L. Milano, "A novel neural network-based survival analysis model," Neural Networks, vol. 16, pp. 855–864, 2003.

[16] B. D. Ripley and R. M. Ripley, "Neural networks as statistical methods in survival analysis," in Clinical Applications of Artificial Neural Networks, pp. 237–255, Cambridge University Press, Cambridge, UK, 2001.

[17] B. Warner and M. Manavendra, "Understanding neural networks as statistical tools," The American Statistician, vol. 50, no. 4, pp. 284–293, 1996.

[18] J. P. Klein and M. L. Moeschberger, Survival Analysis: Techniques for Censored and Truncated Data, Springer, New York, NY, USA, 2nd edition, 2003.

[19] J. W. Kay and D. M. Titterington, Statistics and Neural Networks, Oxford University Press, Oxford, UK, 1999.

[20] E. P. Goss and G. S. Vozikis, "Improving health care organizational management through neural network learning," Health Care Management Science, vol. 5, pp. 221–227, 2002.

[21] R. M. Ripley, A. L. Harris, and L. Tarassenko, "Non-linear survival analysis using neural networks," Statistics in Medicine, vol. 23, pp. 825–842, 2004.

[22] P. J. G. Lisboa, H. Wong, P. Harris, and R. Swindell, "A Bayesian neural network approach for modeling censored data with an application to prognosis after surgery for breast cancer," Artificial Intelligence in Medicine, vol. 28, no. 1, pp. 1–25, 2003.

[23] C. M. Bishop, Neural Networks for Pattern Recognition, The Clarendon Press, Oxford University Press, New York, NY, USA, 1995.

[24] B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, UK, 1996.

[25] N. Fallah, G. U. Hong, K. Mohammad et al., "Nonlinear Poisson regression using neural networks: a simulation study," Neural Computing and Applications, vol. 18, no. 8, pp. 939–943, 2009.

[26] A. Xiang, P. Lapuerta, A. Ryutov et al., "Comparison of the performance of neural network methods and Cox regression for censored survival data," Computational Statistics & Data Analysis, vol. 34, no. 2, pp. 243–257, 2000.

[27] F. E. Harrell, R. M. Califf, D. B. Pryor et al., "Evaluating the yield of medical tests," The Journal of the American Medical Association, vol. 247, pp. 2543–2546, 1982.

[28] F. E. Harrell, K. L. Lee, R. M. Califf et al., "Regression modeling strategies for improved prognostic prediction," Statistics in Medicine, vol. 3, pp. 143–152, 1984.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: Research Article Nonlinear Survival Regression Using Artificial Neural …downloads.hindawi.com/journals/jps/2013/753930.pdf · 2019. 7. 31. · regression, is a popular method in

Journal of Probability and Statistics 3

where the penalty term is 119901120582(12057312057211205722 120572

119867) = 120582(sum120573

2

119894+

sum1205722

119894119895)In likelihood schema which is often used in shrinkage

method an adaption of (7) is [8 21]

119871 = minus log likelihood + 120582 (sum1205732

119894+sum120572

2

119894119895) (8)

The penalty weight 120582 regulates between over- and underfit-ting A best value of 120582 is between 0001 and 01 and is chosenby cross-validation [1 8] In this paper we use (8) to get theparameter estimated It is mentioned that for an outcome(the response variable) with two classes 120575 = (01) 119901

119894is

probability of event for the 119894th patient and the error functionprovides the cross-entropy error function as

119864 = minussum120575119894log (119901

119894) + (1 minus 120575

119894) log (1 minus 119901

119894) (9)

An ANN can be viewed as a generalized linear model with nonlinear predictors [8–11]. Biganzoli et al. [8] introduced a method called the partial logistic ANN (PLANN), and Lisboa [22] developed it further with smooth estimates of the discrete-time hazard built into the structure. It is similar to an MLP [23] with an additional covariate, namely time, as an input, and is given by

$$\frac{h_p(x_p, t_k)}{1 - h_p(x_p, t_k)} = \exp\Big(\sum_{h=1}^{N_h} w_h\, g\Big(\sum_{i=1}^{N_i} w_{ih} x_{pi} + w_k t_k + b_h\Big) + b\Big), \quad (10)$$

where $N_i$ and $N_h$ denote the numbers of input and hidden nodes, respectively, and $b_h$ and $b$ denote the bias terms in the hidden and output layers, respectively. After estimation of the network weights $w$, a single output node estimates the conditional failure probabilities from the connections with the hidden units, and the survivorship is calculated from the estimated discrete-time hazard by multiplying the conditional survival probabilities over the time intervals. The $-\log(\text{likelihood})$ statistic can then be obtained as [8]

$$E = -\sum_{p=1}^{n} \sum_{k=1}^{t_p} \big[\delta_{pk} \log(h_p(x_p, t_k)) + (1 - \delta_{pk}) \log(1 - h_p(x_p, t_k))\big], \quad (11)$$

where $n$ is the number of patients, the inner sum runs over the time intervals observed for patient $p$, and $\delta$ is the censoring indicator function.
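The hazard-to-survivorship step described above is a simple running product. A minimal sketch (illustrative Python; the paper's own implementation was in R):

```python
def survivorship(hazards):
    """S(t_k) = prod over j <= k of (1 - h(t_j)): the survivorship
    obtained by multiplying the conditional survival probabilities
    of the estimated discrete-time hazards over the intervals."""
    s, out = 1.0, []
    for h in hazards:
        s *= 1.0 - h
        out.append(s)
    return out

# a constant discrete hazard of 0.1 over three intervals
surv = survivorship([0.1, 0.1, 0.1])  # approximately [0.9, 0.81, 0.729]
```

Because each factor $1 - h(t_j)$ is at most 1, the survivorship is non-increasing over time, as it must be.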

2.3. Model Fitting. The ultimate goal of the learning process is to minimize the error of the net. In the training step, to fit the model with a fixed number of hidden nodes, we use the penalized likelihood

$$E^* = E + \lambda\Big(\sum_i \beta_i^2 + \sum_{i,j} \alpha_{ij}^2\Big). \quad (12)$$

This improves the convergence of the optimization and also controls the overfitting problem [1, 8, 9, 16, 21].

To identify the number of hidden nodes and then select the model, the Bayesian Information Criterion (BIC) and the Network Information Criterion (NIC) [8, 23, 24], the latter a generalization of the Akaike Information Criterion (AIC), are calculated:

$$\text{BIC} = -2 \times \log\text{likelihood} + \log(N) \times P, \qquad \text{NIC} = 2E^* + 2P, \quad (13)$$

where $P$ is the number of estimated parameters and $N$ is the number of observations in the training set. The best model is the one with the smallest value of these criteria. In addition, to assess prediction accuracy in the validation (testing) group, we calculated classification accuracy and mean square error (MSE).
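The two criteria in (13) are one-liners once the log-likelihood and penalized error are available. A minimal sketch (illustrative; the paper worked in R):

```python
import math

def bic(loglik, n_params, n_obs):
    """BIC = -2 * log-likelihood + log(N) * P, as in (13)."""
    return -2.0 * loglik + math.log(n_obs) * n_params

def nic(penalized_error, n_params):
    """NIC = 2 * E* + 2 * P, as in (13); smaller is better."""
    return 2.0 * penalized_error + 2.0 * n_params

# e.g. a network with P = 10 parameters fit on the N = 700 training cases
score = bic(-350.0, n_params=10, n_obs=700)
```

Both criteria penalize model size, so among networks with similar fit the one with fewer hidden nodes wins.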

The best model is selected as the one with the smallest MSE. The models considered 2, 3, 4, 5, 10, 15, and 20 hidden nodes. The weight decay was set to 0.012, chosen on the basis of an empirical study [25].

Finally, to compare the Cox and ANN predictions, classification accuracy and concordance indexes were calculated. All simulations and comparisons were performed in R 2.14.1.

3. Simulation

To compare the accuracy of the predictions of the ANN and Cox regression, four different simulation schemes based on Monte Carlo simulation were used. In each scheme, the hazard at any time $t$ was taken to be of exponential form [26], namely $\lambda$ (Table 1). For each scheme, 1000 independent random observations were generated, and then, based on the relationship between the exponential parameter and the independent variables, survival times were generated. Afterward, the survival times were right-censored: if a generated time $t_i$ is greater than the corresponding quantile of the exponential distribution with parameter $\lambda$, it is treated as censored. This process was repeated 100 times. To assess the accuracy of the predictions, each sample was randomly divided into two parts: the training group, consisting of 700 observations, and the remaining 300 observations, allocated to the second group, that is, the testing group. Furthermore, in all simulations, the average rates of censoring were set to 20%, 30%, 40%, 50%, 60%, 70%, and 80%. In addition, the models were considered with the main effects only and with interaction terms, as the simple and complex models, respectively.

In simulations 1 and 2, two covariates were used, generated randomly from binomial and standard normal distributions; the models of these simulations contained no and one interaction term, respectively. In simulation 3, three covariates were used, generated randomly from binomial and standard normal distributions; the model of this simulation had two interaction terms. In simulation 4, four covariates were used, generated randomly from binomial and standard normal distributions; this model is complex and consists of two-, three-, and four-way interaction terms (Table 1).
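The data-generating step for the simplest scheme can be sketched as follows. This is an illustrative Python version of Model (I) from Table 1 (the paper used R, and the exact quantile-based censoring rule is our reading of the description, so details may differ):

```python
import math
import random

def simulate_model_I(n=1000, censor_rate=0.2, seed=42):
    """Sketch of Model (I): lambda = exp(X1 + 0.2*X2), with
    X1 ~ Ber(0.5) and X2 ~ N(0, 1). Survival times are exponential
    with rate lambda; times beyond the (1 - censor_rate) quantile
    of Exp(lambda) are right-censored at that quantile."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = 1 if rng.random() < 0.5 else 0
        x2 = rng.gauss(0.0, 1.0)
        lam = math.exp(x1 + 0.2 * x2)
        t = rng.expovariate(lam)                # exponential survival time
        cutoff = -math.log(censor_rate) / lam   # (1 - censor_rate) quantile
        event = 1 if t <= cutoff else 0         # 0 = right-censored
        data.append((min(t, cutoff), event, x1, x2))
    return data

sample = simulate_model_I()
train, test = sample[:700], sample[700:]        # the paper's 700/300 split
```

With this rule, each subject is censored with probability exactly `censor_rate`, which reproduces the target average censoring level.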


Table 1: Description of the models* and their censoring status.

Model (I): $\lambda = \exp(X_1 + 0.2X_2)$; $X_1 \sim \mathrm{Ber}(0.5)$, $X_2 \sim N(0, 1)$.

Model (II): $\lambda = \exp(X_1 + 0.2X_2 + 0.2X_1X_2)$; $X_1 \sim \mathrm{Ber}(0.5)$, $X_2 \sim N(0, 1)$.

Model (III): $\lambda = \exp(X_1 + X_2 + 0.5X_3 + 0.5X_1X_2 + 0.25X_1X_3 + 0.25X_2X_3)$; $X_1 \sim \mathrm{Ber}(0.25)$, $X_2 \sim \mathrm{Ber}(0.5)$, $X_3 \sim N(0, 1)$.

Model (IV): $\lambda = \exp[2(X_1 + X_2) + 1(X_3) + 0.5(X_4) + 1.0(X_1X_2 + X_1X_3 + X_1X_4) + 0.5(X_2X_3 + X_2X_4 + X_3X_4) + 0.25(X_1X_2X_3 + X_1X_2X_4 + X_1X_3X_4 + X_2X_3X_4) + 0.5(X_1X_2X_3X_4)]$; $X_1 \sim \mathrm{Ber}(0.25)$, $X_2 \sim \mathrm{Ber}(0.5)$, $X_3 \sim N(0, 1)$, $X_4 \sim N(0, 1)$.

Censoring status (all models): average censoring for the training and testing sets equal to 20%, 30%, 40%, 50%, 60%, 70%, and 80%.

*The hazard at any time $t$ was considered to be of exponential form, and the survival time was then generated.


Table 2: Results of ANN models for the simulated data, by number of hidden nodes H. Each cell shows BIC** / MSE*** for an MLP with the given H.

Model* Censoring | H=2 | H=3 | H=4 | H=5 | H=10 | H=15 | H=20

I 20% | 0.213/1.055 | 0.203/1.148 | 0.205/1.252 | 0.214/1.302 | 0.223/1.405 | 0.254/1.387 | 0.278/2.235
I 30% | 0.213/1.145 | 0.212/1.149 | 0.210/1.241 | 0.211/1.249 | 0.219/1.411 | 0.239/1.422 | 0.269/2.222
I 40% | 0.212/1.134 | 0.217/1.163 | 0.209/1.321 | 0.209/1.392 | 0.228/1.433 | 0.249/1.473 | 0.283/2.314
I 50% | 0.227/1.243 | 0.209/1.187 | 0.234/1.449 | 0.203/1.218 | 0.231/1.429 | 0.251/1.521 | 0.279/2.543
I 60% | 0.256/1.289 | 0.217/1.251 | 0.239/1.443 | 0.211/1.325 | 0.241/1.511 | 0.262/1.611 | 0.281/2.468
I 70% | 0.281/1.259 | 0.226/1.263 | 0.259/1.398 | 0.232/1.281 | 0.224/1.534 | 0.273/1.597 | 0.280/2.413
I 80% | 0.278/1.337 | 0.232/1.258 | 0.266/1.527 | 0.223/1.216 | 0.249/1.617 | 0.261/2.018 | 0.266/2.731

II 20% | 0.229/1.255 | 0.223/1.248 | 0.235/1.452 | 0.234/1.402 | 0.323/2.005 | 0.264/1.987 | 0.269/2.315
II 30% | 0.254/1.431 | 0.282/1.428 | 0.235/1.665 | 0.243/1.723 | 0.259/1.835 | 0.263/2.650 | 0.276/2.123
II 40% | 0.231/1.233 | 0.218/1.219 | 0.221/1.332 | 0.220/1.381 | 0.249/1.531 | 0.257/1.583 | 0.291/2.411
II 50% | 0.237/1.342 | 0.222/1.317 | 0.241/1.539 | 0.213/1.318 | 0.251/1.689 | 0.266/1.591 | 0.280/2.561
II 60% | 0.256/1.381 | 0.245/1.359 | 0.248/1.471 | 0.232/1.425 | 0.255/1.533 | 0.282/1.635 | 0.279/2.471
II 70% | 0.281/1.259 | 0.236/1.263 | 0.263/1.411 | 0.223/1.279 | 0.263/1.576 | 0.291/1.608 | 0.291/2.409
II 80% | 0.278/1.337 | 0.255/1.358 | 0.287/1.597 | 0.269/1.221 | 0.247/1.523 | 0.287/2.107 | 0.273/2.729

III 20% | 0.283/1.261 | 0.235/1.274 | 0.250/1.408 | 0.241/1.299 | 0.240/1.539 | 0.277/1.607 | 0.291/2.433
III 30% | 0.269/1.344 | 0.241/1.262 | 0.260/1.532 | 0.249/1.284 | 0.252/1.623 | 0.258/2.208 | 0.269/2.751
III 40% | 0.367/1.204 | 0.259/1.888 | 0.202/1.883 | 0.202/2.329 | 0.247/2.428 | 0.222/2.222 | 0.264/1.895
III 50% | 0.241/1.358 | 0.232/1.321 | 0.252/1.541 | 0.220/1.327 | 0.263/1.699 | 0.277/1.603 | 0.288/2.593
III 60% | 0.261/1.421 | 0.255/1.362 | 0.259/1.485 | 0.244/1.432 | 0.274/1.543 | 0.295/1.688 | 0.280/2.528
III 70% | 0.289/1.371 | 0.267/1.274 | 0.271/1.422 | 0.253/1.269 | 0.292/1.552 | 0.286/1.662 | 0.303/2.687
III 80% | 0.286/1.345 | 0.273/1.366 | 0.281/1.622 | 0.265/1.482 | 0.295/1.538 | 0.302/2.134 | 0.333/2.869

IV 20% | 0.243/1.388 | 0.269/1.422 | 0.239/1.365 | 0.241/1.350 | 0.270/1.708 | 0.288/1.694 | 0.293/2.585
IV 30% | 0.267/1.435 | 0.259/1.471 | 0.230/1.488 | 0.237/1.499 | 0.280/1.638 | 0.285/1.679 | 0.311/2.569
IV 40% | 0.290/1.387 | 0.273/1.365 | 0.282/1.433 | 0.268/1.328 | 0.288/1.595 | 0.296/1.675 | 0.298/2.701
IV 50% | 0.292/1.667 | 0.281/1.621 | 0.283/1.560 | 0.271/1.634 | 0.279/1.807 | 0.288/1.632 | 0.302/2.325
IV 60% | 0.298/1.832 | 0.204/1.834 | 0.199/2.386 | 0.161/2.912 | 0.158/2.126 | 0.159/2.325 | 0.173/2.667
IV 70% | 0.238/2.181 | 0.239/2.174 | 0.247/2.392 | 0.237/2.403 | 0.253/2.359 | 0.233/2.249 | 0.256/2.411
IV 80% | 0.300/2.087 | 0.301/1.902 | 0.321/2.099 | 0.292/2.077 | 0.295/2.364 | 0.289/2.630 | 0.322/2.716

*Four models with different rates of censoring. **Average BIC for the learning set (700 cases with 100 replications). ***Average MSE for the testing set (300 cases with 100 replications).


Table 3: Results of the concordance indexes of the simulation study in the testing subset (300 cases with 100 replications).

Model* Censoring | ANN | Cox Reg.

I 20% | 0.821 ± 0.028 | 0.820 ± 0.031
I 30% | 0.819 ± 0.027 | 0.820 ± 0.030
I 40% | 0.824 ± 0.028 | 0.823 ± 0.030
I 50% | 0.823 ± 0.027 | 0.823 ± 0.029
I 60% | 0.823 ± 0.027 | 0.822 ± 0.030
I 70% | 0.825 ± 0.027 | 0.822 ± 0.032
I 80% | 0.826 ± 0.030 | 0.823 ± 0.031

II 20% | 0.818 ± 0.029 | 0.817 ± 0.031
II 30% | 0.817 ± 0.034 | 0.815 ± 0.046
II 40% | 0.818 ± 0.032 | 0.818 ± 0.031
II 50% | 0.816 ± 0.035 | 0.814 ± 0.032
II 60% | 0.816 ± 0.031 | 0.813 ± 0.032
II 70% | 0.814 ± 0.033 | 0.811 ± 0.031
II 80% | 0.812 ± 0.034 | 0.808 ± 0.036

III 20% | 0.808 ± 0.030 | 0.798 ± 0.035
III 30% | 0.799 ± 0.033 | 0.790 ± 0.033
III 40% | 0.812 ± 0.043 | 0.795 ± 0.038
III 50% | 0.805 ± 0.028 | 0.795 ± 0.030
III 60% | 0.804 ± 0.028 | 0.790 ± 0.033
III 70% | 0.805 ± 0.028 | 0.791 ± 0.034
III 80% | 0.803 ± 0.028 | 0.790 ± 0.032

IV 20% | 0.764 ± 0.033 | 0.759 ± 0.036
IV 30% | 0.773 ± 0.029 | 0.762 ± 0.033
IV 40% | 0.777 ± 0.030 | 0.759 ± 0.035
IV 50% | 0.779 ± 0.035 | 0.764 ± 0.034
IV 60% | 0.812 ± 0.039 | 0.790 ± 0.036
IV 70% | 0.764 ± 0.036 | 0.744 ± 0.035
IV 80% | 0.766 ± 0.035 | 0.741 ± 0.037

*Four models with different rates of censoring.

Model selection is based on BIC for the learning set and on the MSE criterion for the testing subset as verification. The results in Table 2 show that the simple models perform well with fewer hidden nodes, whereas the complex models perform better with more hidden nodes. The MSE values confirm these results (Table 2).

In the next step, to compare the ANN and Cox regression predictions, concordance indexes were calculated from the classification accuracy table in the testing subset. The concordance index has been described as a generalization of the area under the receiver operating characteristic curve for censored data [27, 28]. This index gives the proportion of cases that are classified correctly in the noncensored (event) and censored groups, with values between 0 and 1 indicating the accuracy of the models. The concordance indexes of the ANN and Cox regression models are reported in Table 3. The results of the simulation study for the simpler models showed no difference between the predictions of the Cox regression and ANN models, but the ANN predictions were better than the Cox regression predictions for the complex models with high rates of censoring.
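Harrell's concordance index described above can be computed by a straightforward pairwise count. A minimal sketch (illustrative Python, not the paper's R code; score ties here count one half, one common convention):

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data: over pairs where the
    shorter time is an observed event, the fraction in which the
    higher risk score belongs to the subject with the shorter time."""
    concordant = usable = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # pair (i, j) is usable only if subject i's event was observed
            # before subject j's (possibly censored) time
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / usable

# risk ordering matches the event ordering exactly, so C = 1.0
c = concordance_index([1, 2, 3, 4], [1, 1, 0, 1], [4.0, 3.0, 2.0, 1.0])
```

A value of 0.5 corresponds to random ranking, so the values around 0.8 in Table 3 indicate good discrimination for both models.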

4. Conclusion

In this paper, we presented two approaches for modeling survival data with different degrees of censoring: Cox regression and neural network models. A Monte Carlo simulation study was performed to compare the predictive accuracy of the Cox and neural network models on simulated data sets.

In the simulation study, four different models were considered. The rate of censoring in each of these models ranged from 20% up to 80%. These models were considered with the main effects only and also with interaction terms, and the predictive ability of these models was then evaluated. As was seen, for the simple models and with fewer censored cases, there was little difference between the ANN and Cox regression predictions. It seems that for the simpler models the level of censoring has no effect on the predictions, but the predictions in the more complex models depend on the level of censoring. The results showed that the NN model provided better predictions for the more complex models, whereas for the simpler models there was no difference in the results. This finding is consistent with Xiang's study [26]. Therefore, the NN model is proposed in two cases: (1) high censoring (i.e., a censoring rate of 60% or higher) and/or (2) complex models (i.e., with many covariates and interaction terms). This result is useful in practical problems, which often involve many variables and many censored cases; in these two cases, the ANN strategy can be used as an alternative to the traditional Cox model. Finally, we note that there are other flexible methods, such as piecewise exponential and grouped-time models, which can be used for survival data, and their ability could then be compared with the ANN model.

Acknowledgment

The authors wish to express their special thanks to the referees for their valuable comments.

References

[1] E. T. Lee and J. W. Wang, Statistical Methods for Survival Data Analysis, Wiley Series in Probability and Statistics, Wiley-Interscience, Hoboken, NJ, USA, 3rd edition, 2003.

[2] V. Lagani and I. Tsamardinos, "Structure-based variable selection for survival data," Bioinformatics, vol. 26, no. 15, pp. 1887–1894, 2010.

[3] M. H. Kutner, C. J. Nachtsheim, and J. Neter, Applied Linear Regression Models, McGraw-Hill/Irwin, New York, NY, USA, 4th edition, 2004.

[4] W. G. Baxt and J. Skora, "Prospective validation of artificial neural networks trained to identify acute myocardial infarction," Lancet, vol. 347, pp. 12–15, 1996.

[5] B. A. Mobley, E. Schecheer, and W. E. Moore, "Prediction of coronary artery stenosis by artificial networks," Artificial Intelligence in Medicine, vol. 18, pp. 187–203, 2000.

[6] D. West and V. West, "Model selection for medical diagnostic decision support system: breast cancer detection case," Artificial Intelligence in Medicine, vol. 20, pp. 183–204, 2000.

[7] F. Ambrogi, N. Lama, P. Boracchi, and E. Biganzoli, "Selection of artificial neural network models for survival analysis with genetic algorithms," Computational Statistics & Data Analysis, vol. 52, no. 1, pp. 30–42, 2007.

[8] E. Biganzoli, P. Boracchi, L. Mariani, and E. Marubini, "Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach," Statistics in Medicine, vol. 17, pp. 1169–1186, 1998.

[9] E. Biganzoli, P. Boracchi, and E. Marubini, "A general framework for neural network models on censored survival data," Neural Networks, vol. 15, pp. 209–218, 2002.

[10] E. M. Bignazoli and P. Borrachi, "The Partial Logistic Artificial Neural Network (PLANN): a tool for the flexible modelling of censored survival data," in Proceedings of the European Conference on Emergent Aspects in Clinical Data Analysis (EACDA '05), 2005.

[11] E. M. Bignazoli, P. Borrachi, F. Amborgini, and E. Marubini, "Artificial neural network for the joint modeling of discrete cause-specific hazards," Artificial Intelligence in Medicine, vol. 37, pp. 119–130, 2006.

[12] R. Bittern, A. Cuschieri, S. D. Dolgobrodov, et al., "An artificial neural network for analysing the survival of patients with colorectal cancer," in Proceedings of the European Symposium on Artificial Neural Networks (ESANN '05), Bruges, Belgium, April 2005.

[13] K. U. Chen and C. J. Christian, "Using back-propagation neural network to forecast the production values of the machinery industry in Taiwan," Journal of American Academy of Business, Cambridge, vol. 9, no. 1, pp. 183–190, 2006.

[14] C. L. Chia, W. Nick Street, and H. W. William, "Application of artificial neural network-based survival analysis on two breast cancer datasets," in Proceedings of the American Medical Informatics Association Annual Symposium (AMIA '07), pp. 130–134, Chicago, Ill, USA, November 2007.

[15] A. Eleuteri, R. Tagliaferri, and L. Milano, "A novel neural network-based survival analysis model," Neural Networks, vol. 16, pp. 855–864, 2003.

[16] B. D. Ripley and R. M. Ripley, "Neural networks as statistical methods in survival analysis," in Clinical Applications of Artificial Neural Networks, pp. 237–255, Cambridge University Press, Cambridge, UK, 2001.

[17] B. Warner and M. Manavendra, "Understanding neural networks as statistical tools," The American Statistician, vol. 50, no. 4, pp. 284–293, 1996.

[18] J. P. Klein and M. L. Moeschberger, Survival Analysis: Techniques for Censored and Truncated Data, Springer, New York, NY, USA, 2nd edition, 2003.

[19] J. W. Kay and D. M. Titterington, Statistics and Neural Networks, Oxford University Press, Oxford, UK, 1999.

[20] E. P. Goss and G. S. Vozikis, "Improving health care organizational management through neural network learning," Health Care Management Science, vol. 5, pp. 221–227, 2002.

[21] R. M. Ripley, A. L. Harris, and L. Tarassenko, "Non-linear survival analysis using neural networks," Statistics in Medicine, vol. 23, pp. 825–842, 2004.

[22] P. J. G. Lisboa, H. Wong, P. Harris, and R. Swindell, "A Bayesian neural network approach for modeling censored data with an application to prognosis after surgery for breast cancer," Artificial Intelligence in Medicine, vol. 28, no. 1, pp. 1–25, 2003.

[23] C. M. Bishop, Neural Networks for Pattern Recognition, The Clarendon Press, Oxford University Press, New York, NY, USA, 1995.

[24] B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, UK, 1996.

[25] N. Fallah, G. U. Hong, K. Mohammad, et al., "Nonlinear Poisson regression using neural networks: a simulation study," Neural Computing and Applications, vol. 18, no. 8, pp. 939–943, 2009.

[26] A. Xiang, P. Lapuerta, A. Ryutov, et al., "Comparison of the performance of neural network methods and Cox regression for censored survival data," Computational Statistics & Data Analysis, vol. 34, no. 2, pp. 243–257, 2000.

[27] F. E. Harrell, R. M. Califf, D. B. Pryor, et al., "Evaluating the yield of medical tests," The Journal of the American Medical Association, vol. 247, pp. 2543–2546, 1982.

[28] F. E. Harrell, K. L. Lee, R. M. Califf, et al., "Regression modeling strategies for improved prognostic prediction," Statistics in Medicine, vol. 3, pp. 143–152, 1984.


Page 4: Research Article Nonlinear Survival Regression Using Artificial Neural …downloads.hindawi.com/journals/jps/2013/753930.pdf · 2019. 7. 31. · regression, is a popular method in

4 Journal of Probability and Statistics

Table1Describingof

mod

elslowast

andits

censoringstatus

Mod

elCensorin

gsta

tus

Mod

el(I)120582=exp(1198831+021198832)1198831simBe

r(05)1198832sim119873(01)

Averagec

ensorin

gfortrainingandtestingsetsequalto2030

40

5060

70

and

80

Mod

el(II)120582=exp(1198831+021198832+0211988311198832)1198831simBe

r(05)1198832sim119873(01)

Averagec

ensorin

gfortrainingandtestingsetsequalto2030

40

5060

70

and

80

Mod

el(III)120582=exp(1198831+1198832+051198833+0511988311198831+02511988311198833+02511988311198833)

1198831simBe

r(025)1198832simBe

r(05)1198833sim119873(01)

Averagec

ensorin

gfortrainingandtestingsetsequalto2030

40

5060

70

and

80

Mod

el(IV)

120582=exp[2(1198831+1198832)+1(1198833)+05(1198834)+10(11988311198832+11988311198833+11988311198834)

+05(11988321198833+11988321198834+11988331198834)+025(119883111988321198833+119883111988321198834+119883111988331198834+119883211988331198834)

+05(1198831119883211988331198834)]1198831simBe

r(025)1198832simBe

r(05)1198833sim119873(01)1198834sim119873(01)

Averagec

ensorin

gfortrainingandtestingsetsequalto2030

40

5060

70

and

80

lowast

Theh

azardatanytim

e119905was

considered

asexpo

nentialformand

then

thes

urvivaltim

ewas

generated

Journal of Probability and Statistics 5

Table2Re

sults

ofANNmod

elsfor

simulated

data

of119867

MLP

(119867=2)

MLP

(119867=3)

MLP

(119867=4)

MLP

(119867=5)

MLP

(119867=10)

MLP

(119867=15)

MLP

(119867=20)

Mod

ellowast

BIClowastlowast

MSElowastlowastlowast

BIC

MSE

BIC

MSE

BIC

MSE

BIC

MSE

BIC

MSE

BIC

MSE

I20

0213

1055

0203

1148

0205

1252

0214

1302

0223

1405

0254

1387

0278

2235

30

0213

1145

0212

1149

0210

1241

0211

1249

0219

1411

0239

1422

0269

2222

40

0212

1134

0217

1163

0209

1321

0209

1392

0228

1433

0249

1473

0283

2314

50

0227

1243

0209

1187

0234

1449

0203

1218

0231

1429

0251

1521

0279

2543

60

0256

1289

0217

1251

0239

1443

0211

1325

0241

1511

0262

1611

0281

246

870

0281

1259

0226

1263

0259

1398

0232

1281

0224

1534

0273

1597

0280

2413

80

0278

1337

0232

1258

0266

1527

0223

1216

0249

1617

0261

2018

0266

2731

II 20

0229

1255

0223

1248

0235

1452

0234

1402

0323

2005

0264

1987

0269

2315

30

0254

1431

0282

1428

0235

1665

0243

1723

0259

1835

0263

2650

0276

2123

40

0231

1233

0218

1219

0221

1332

0220

1381

0249

1531

0257

1583

0291

2411

50

0237

1342

0222

1317

0241

1539

0213

1318

0251

1689

0266

1591

0280

2561

60

0256

1381

0245

1359

0248

1471

0232

1425

0255

1533

0282

1635

0279

2471

70

0281

1259

0236

1263

0263

1411

0223

1279

0263

1576

0291

1608

0291

240

980

0278

1337

0255

1358

0287

1597

0269

1221

0247

1523

0287

2107

0273

2729

III 20

0283

1261

0235

1274

0250

1408

0241

1299

0240

1539

0277

1607

0291

2433

30

0269

1344

0241

1262

0260

1532

0249

1284

0252

1623

0258

2208

0269

2751

40

0367

1204

0259

1888

0202

1883

0202

2329

0247

2428

0222

2222

0264

1895

50

0241

1358

0232

1321

0252

1541

0220

1327

0263

1699

0277

1603

0288

2593

60

0261

1421

0255

1362

0259

1485

0244

1432

0274

1543

0295

1688

0280

2528

70

0289

1371

0267

1274

0271

1422

0253

1269

0292

1552

0286

1662

0303

2687

80

0286

1345

0273

1366

0281

1622

0265

1482

0295

1538

0302

2134

0333

2869

IV 20

0243

1388

0269

1422

0239

1365

0241

1350

0270

1708

0288

1694

0293

2585

30

0267

1435

0259

1471

0230

1488

0237

1499

0280

1638

0285

1679

0311

2569

40

0290

1387

0273

1365

0282

1433

0268

1328

0288

1595

0296

1675

0298

2701

50

0292

1667

0281

1621

0283

1560

0271

1634

0279

1807

0288

1632

0302

2325

60

0298

1832

0204

1834

0199

2386

0161

2912

0158

2126

0159

2325

0173

2667

70

0238

2181

0239

2174

0247

2392

0237

2403

0253

2359

0233

2249

0256

2411

80

0300

2087

0301

1902

0321

2099

0292

2077

0295

2364

0289

2630

0322

2716

lowast Four

mod

elswith

different

rateso

fcensorin

glowastlowastAv

erageo

fBIC

forlearningset(700casesw

ith100replications)

lowastlowastlowastAv

erageo

fMSE

fortestin

gset(300casesw

ith100replications)

6 Journal of Probability and Statistics

Table 3 Results of concordance indexes of simulation study intesting subset (300 cases with 100 replications)

Modellowast ANN Cox RegI

20 0821 plusmn 0028 0820 plusmn 0031

30 0819 plusmn 0027 0820 plusmn 0030

40 0824 plusmn 0028 0823 plusmn 0030

50 0823 plusmn 0027 0823 plusmn 0029

60 0823 plusmn 0027 0822 plusmn 0030

70 0825 plusmn 0027 0822 plusmn 0032

80 0826 plusmn 0030 0823 plusmn 0031

II20 0818 plusmn 0029 0817 plusmn 0031

30 0817 plusmn 0034 0815 plusmn 0046

40 0818 plusmn 0032 0818 plusmn 0031

50 0816 plusmn 0035 0814 plusmn 0032

60 0816 plusmn 0031 0813 plusmn 0032

70 0814 plusmn 0033 0811 plusmn 0031

80 0812 plusmn 0034 0808 plusmn 0036

III20 0808 plusmn 0030 0798 plusmn 0035

30 0799 plusmn 0033 0790 plusmn 0033

40 0812 plusmn 0043 0795 plusmn 0038

50 0805 plusmn 0028 0795 plusmn 0030

60 0804 plusmn 0028 0790 plusmn 0033

70 0805 plusmn 0028 0791 plusmn 0034

80 0803 plusmn 0028 0790 plusmn 0032

IV20 0764 plusmn 0033 0759 plusmn 0036

30 0773 plusmn 0029 0762 plusmn 0033

40 0777 plusmn 0030 0759 plusmn 0035

50 0779 plusmn 0035 0764 plusmn 0034

60 0812 plusmn 0039 0790 plusmn 0036

70 0764 plusmn 0036 0744 plusmn 0035

80 0766 plusmn 0035 0741 plusmn 0037lowast

Four models with different rates of censoring

The model selection is based on BIC for learning set andSSE criterion for the testing subset data as a verification Theresults in Table 2 show that the simple model performs withless hidden node but complex model performs better withmore hidden nodes The MSE values confirm these results(Table 2)

In the next step to compare of ANN and Cox regressionpredictions concordance indexes were calculated from clas-sification accuracy table in testing subset Concordance indexwas reported as a generalization of the area under receiveroperating characteristic curve for censored data [27 28]Thisindexmeans that the proportion of the cases that are classifiedcorrectly in noncensored (event) and censored groups and 0to 1 values indicated as the ability of the models accuracyTheconcordance index of ANN and Cox regression models was

reported in Table 3The results of simulation study in simplermodel showed that there was not any different betweenthe predictions of Cox regression and NN models But NNpredictions were better than Cox regression predictions incomplex model with high rates of censoring

4 Conclusion

In this paper we presented two approaches for modeling ofsurvival data with different degrees of censoring Cox regres-sion and neural network models A Monte-Carlo simulationstudy was performed to compare predictive accuracy of Coxand neural network models in simulation data sets

In the simulation study four different models wereconsideredThe rate of censorship in each of thesemodelswasconsidered from 20 up 80These models were consideredwith the main effects and also with the interaction termsThen the ability of these models in prediction was evaluatedAs was seen in simple models and with less censored casesthere was little difference in ANN and Cox regressionmodelspredictions It seems that for simpler models the levels ofcensorship have no effect on predictions but the predictionsin more complex models depend on the levels of censorshipThe results showed that the NN model for more complexmodels was provided better predictions But for simplermodels predictions there was not any different in resultsThis result was consistent with the finding fromXiangrsquos study[26] Therefore NN model is proposed in two cases of (1)occurrence of high censorship (ie censoring rate of 60 andhigher) andor (2) in the complex models (ie with manycovariates and any interaction terms) This is a very goodresult and can be used in practical issues which often arefaced onwithmany numbers of variables and alsomany casesof censorship For that reason in these two cases the ANNstrategy can be used as an alternate of traditional Cox modelFinally it is mentioned that there are some flexible alternativemethods such as piecewise exponential and grouped timemodels which can be used for survival data and then its abilitycompared with ANNmodel

Acknowledgment

The authors wish to express their special thanks to refereesfor their valuable comments

References

[1] E T Lee and J W Wang Statistical Methods for SurvivalData Analysis Wiley Series in Probability and Statistics Wiley-Interscience Hoboken NJ USA 3rd edition 2003

[2] V Lagani and I Tsamardinos ldquoStructure-based variable selec-tion for survival datardquo Bioinformatics vol 26 no 15 pp 1887ndash1894 2010

[3] M H Kutner C J Nachtsheim and J Neter Applied LinearRegression Models McGraw-HillIrwin New York NY USA4th edition 2004

[4] WG Baxt and J Skora ldquoProspective validation of artificial neu-ral networks trained to identify acute myocardial infarctionrdquoLancet vol 347 pp 12ndash15 1996

Journal of Probability and Statistics 7

[5] B A Mobley E Schecheer and W E Moore ldquoPredictionof coronary artery stenosis by artificial networksrdquo ArtificialIntelligence in Medicine vol 18 pp 187ndash203 2000

[6] D West and V West ldquoModel selection for medical diagnosticdecision support system breast cancer detection caserdquoArtificialIntelligence in Medicine vol 20 pp 183ndash204 2000

[7] F Ambrogi N Lama P Boracchi and E Biganzoli ldquoSelectionof artificial neural network models for survival analysis withgenetic algorithmsrdquo Computational Statistics amp Data Analysisvol 52 no 1 pp 30ndash42 2007

[8] E Biganzoli P Boracchi LMariani andEMarubini ldquoFeed for-ward neural networks for the analysis of censored survival dataa partial logistic regression approachrdquo Statistics inMedicine vol17 pp 1169ndash1186 1998

[9] E Biganzoli P Boracchi and E Marubini ldquoA general frame-work for neural network models on censored survival datardquoNeural Networks vol 15 pp 209ndash218 2002


Journal of Probability and Statistics 5

Table 2: Results of ANN models for the simulated data, by number of hidden nodes H in the MLP. Each cell gives BIC** / MSE***.

Model*       MLP(H=2)     MLP(H=3)     MLP(H=4)     MLP(H=5)     MLP(H=10)    MLP(H=15)    MLP(H=20)
I    20   0.213/1.055  0.203/1.148  0.205/1.252  0.214/1.302  0.223/1.405  0.254/1.387  0.278/2.235
     30   0.213/1.145  0.212/1.149  0.210/1.241  0.211/1.249  0.219/1.411  0.239/1.422  0.269/2.222
     40   0.212/1.134  0.217/1.163  0.209/1.321  0.209/1.392  0.228/1.433  0.249/1.473  0.283/2.314
     50   0.227/1.243  0.209/1.187  0.234/1.449  0.203/1.218  0.231/1.429  0.251/1.521  0.279/2.543
     60   0.256/1.289  0.217/1.251  0.239/1.443  0.211/1.325  0.241/1.511  0.262/1.611  0.281/2.468
     70   0.281/1.259  0.226/1.263  0.259/1.398  0.232/1.281  0.224/1.534  0.273/1.597  0.280/2.413
     80   0.278/1.337  0.232/1.258  0.266/1.527  0.223/1.216  0.249/1.617  0.261/2.018  0.266/2.731
II   20   0.229/1.255  0.223/1.248  0.235/1.452  0.234/1.402  0.323/2.005  0.264/1.987  0.269/2.315
     30   0.254/1.431  0.282/1.428  0.235/1.665  0.243/1.723  0.259/1.835  0.263/2.650  0.276/2.123
     40   0.231/1.233  0.218/1.219  0.221/1.332  0.220/1.381  0.249/1.531  0.257/1.583  0.291/2.411
     50   0.237/1.342  0.222/1.317  0.241/1.539  0.213/1.318  0.251/1.689  0.266/1.591  0.280/2.561
     60   0.256/1.381  0.245/1.359  0.248/1.471  0.232/1.425  0.255/1.533  0.282/1.635  0.279/2.471
     70   0.281/1.259  0.236/1.263  0.263/1.411  0.223/1.279  0.263/1.576  0.291/1.608  0.291/2.409
     80   0.278/1.337  0.255/1.358  0.287/1.597  0.269/1.221  0.247/1.523  0.287/2.107  0.273/2.729
III  20   0.283/1.261  0.235/1.274  0.250/1.408  0.241/1.299  0.240/1.539  0.277/1.607  0.291/2.433
     30   0.269/1.344  0.241/1.262  0.260/1.532  0.249/1.284  0.252/1.623  0.258/2.208  0.269/2.751
     40   0.367/1.204  0.259/1.888  0.202/1.883  0.202/2.329  0.247/2.428  0.222/2.222  0.264/1.895
     50   0.241/1.358  0.232/1.321  0.252/1.541  0.220/1.327  0.263/1.699  0.277/1.603  0.288/2.593
     60   0.261/1.421  0.255/1.362  0.259/1.485  0.244/1.432  0.274/1.543  0.295/1.688  0.280/2.528
     70   0.289/1.371  0.267/1.274  0.271/1.422  0.253/1.269  0.292/1.552  0.286/1.662  0.303/2.687
     80   0.286/1.345  0.273/1.366  0.281/1.622  0.265/1.482  0.295/1.538  0.302/2.134  0.333/2.869
IV   20   0.243/1.388  0.269/1.422  0.239/1.365  0.241/1.350  0.270/1.708  0.288/1.694  0.293/2.585
     30   0.267/1.435  0.259/1.471  0.230/1.488  0.237/1.499  0.280/1.638  0.285/1.679  0.311/2.569
     40   0.290/1.387  0.273/1.365  0.282/1.433  0.268/1.328  0.288/1.595  0.296/1.675  0.298/2.701
     50   0.292/1.667  0.281/1.621  0.283/1.560  0.271/1.634  0.279/1.807  0.288/1.632  0.302/2.325
     60   0.298/1.832  0.204/1.834  0.199/2.386  0.161/2.912  0.158/2.126  0.159/2.325  0.173/2.667
     70   0.238/2.181  0.239/2.174  0.247/2.392  0.237/2.403  0.253/2.359  0.233/2.249  0.256/2.411
     80   0.300/2.087  0.301/1.902  0.321/2.099  0.292/2.077  0.295/2.364  0.289/2.630  0.322/2.716

* Four models with different rates of censoring (%).
** Average BIC for the learning set (700 cases with 100 replications).
*** Average MSE for the testing set (300 cases with 100 replications).


Table 3: Results of concordance indexes of the simulation study in the testing subset (300 cases with 100 replications).

Model*       ANN              Cox Reg
I    20   0.821 ± 0.028   0.820 ± 0.031
     30   0.819 ± 0.027   0.820 ± 0.030
     40   0.824 ± 0.028   0.823 ± 0.030
     50   0.823 ± 0.027   0.823 ± 0.029
     60   0.823 ± 0.027   0.822 ± 0.030
     70   0.825 ± 0.027   0.822 ± 0.032
     80   0.826 ± 0.030   0.823 ± 0.031
II   20   0.818 ± 0.029   0.817 ± 0.031
     30   0.817 ± 0.034   0.815 ± 0.046
     40   0.818 ± 0.032   0.818 ± 0.031
     50   0.816 ± 0.035   0.814 ± 0.032
     60   0.816 ± 0.031   0.813 ± 0.032
     70   0.814 ± 0.033   0.811 ± 0.031
     80   0.812 ± 0.034   0.808 ± 0.036
III  20   0.808 ± 0.030   0.798 ± 0.035
     30   0.799 ± 0.033   0.790 ± 0.033
     40   0.812 ± 0.043   0.795 ± 0.038
     50   0.805 ± 0.028   0.795 ± 0.030
     60   0.804 ± 0.028   0.790 ± 0.033
     70   0.805 ± 0.028   0.791 ± 0.034
     80   0.803 ± 0.028   0.790 ± 0.032
IV   20   0.764 ± 0.033   0.759 ± 0.036
     30   0.773 ± 0.029   0.762 ± 0.033
     40   0.777 ± 0.030   0.759 ± 0.035
     50   0.779 ± 0.035   0.764 ± 0.034
     60   0.812 ± 0.039   0.790 ± 0.036
     70   0.764 ± 0.036   0.744 ± 0.035
     80   0.766 ± 0.035   0.741 ± 0.037

* Four models with different rates of censoring (%).

Model selection was based on the BIC for the learning set, with the MSE criterion on the testing subset as verification. The results in Table 2 show that the simple models perform best with fewer hidden nodes, whereas the complex models perform better with more hidden nodes. The MSE values confirm these results (Table 2).
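This selection rule can be sketched in a few lines. The paper's analyses were run in R 2.14.1; the following is an illustrative Python sketch, not the authors' code. It uses a polynomial fit as a stand-in for an MLP of a given size, and the Gaussian-error form of the BIC, n·ln(SSE/n) + k·ln(n); the data, split sizes, and degrees are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative learning/testing split sized like the paper's (700/300).
n_learn, n_test = 700, 300
x = rng.uniform(-2, 2, n_learn + n_test)
y = np.sin(x) + rng.normal(0, 0.3, x.size)
x_l, y_l, x_t, y_t = x[:n_learn], y[:n_learn], x[n_learn:], y[n_learn:]

def fit_and_score(degree):
    """Fit on the learning set; return (learning-set BIC, testing-set MSE).

    The polynomial degree plays the role of model complexity, like the
    number of hidden nodes H in the paper's MLPs.
    """
    coefs = np.polyfit(x_l, y_l, degree)
    k = degree + 1                                   # free parameters
    sse = np.sum((np.polyval(coefs, x_l) - y_l) ** 2)
    bic = n_learn * np.log(sse / n_learn) + k * np.log(n_learn)
    mse = np.mean((np.polyval(coefs, x_t) - y_t) ** 2)
    return bic, mse

for d in (1, 3, 5, 9):
    bic, mse = fit_and_score(d)
    print(f"complexity {d}: BIC={bic:9.1f}  test MSE={mse:.3f}")
```

The candidate with the lowest learning-set BIC is selected, and the testing-set MSE serves as an independent check that the choice generalizes.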

In the next step, to compare the ANN and Cox regression predictions, concordance indexes were calculated from the classification accuracy table in the testing subset. The concordance index has been proposed as a generalization of the area under the receiver operating characteristic curve for censored data [27, 28]. This index is the proportion of cases that are classified correctly into the noncensored (event) and censored groups; values between 0 and 1 indicate the accuracy of the models. The concordance indexes of the ANN and Cox regression models are reported in Table 3. For the simpler models, the simulation results showed no difference between the predictions of the Cox regression and NN models, but the NN predictions were better than the Cox regression predictions for the complex models with high rates of censoring.
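The concordance index can be computed directly from pairwise comparisons. The following is a minimal, self-contained Python sketch of Harrell's C for right-censored data, assuming one common convention for ties and same-time censoring; it is not the authors' implementation, and the example data are invented.

```python
from itertools import combinations

def concordance_index(time, event, risk):
    """Harrell's C for right-censored data: among usable pairs, the
    fraction where the higher-risk case fails earlier (risk ties = 1/2)."""
    conc = usable = 0.0
    for i, j in combinations(range(len(time)), 2):
        if time[j] < time[i]:     # order so that i has the shorter time
            i, j = j, i
        if not event[i]:          # earlier time censored: pair unusable
            continue
        if time[i] == time[j] and event[j]:
            continue              # tied event times: skip (one convention)
        usable += 1
        if risk[i] > risk[j]:
            conc += 1
        elif risk[i] == risk[j]:
            conc += 0.5
    return conc / usable

# Tiny invented example: risk scores mostly ordered against survival time.
times  = [2, 4, 5, 7, 9]
events = [1, 1, 0, 1, 1]          # 0 = censored
risks  = [0.9, 0.7, 0.6, 0.4, 0.5]
print(round(concordance_index(times, events, risks), 3))  # 0.875
```

With risk scores taken from either the Cox model or the ANN, a value near 1 indicates good discrimination and 0.5 indicates chance-level prediction.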

4. Conclusion

In this paper, we presented two approaches for modeling survival data with different degrees of censoring: Cox regression and neural network models. A Monte Carlo simulation study was performed to compare the predictive accuracy of the Cox and neural network models on simulated data sets.

In the simulation study, four different models were considered. The censoring rate in each of these models ranged from 20% up to 80%. The models were considered with the main effects only and also with interaction terms, and their predictive ability was then evaluated. As was seen, for simple models with fewer censored cases there was little difference between the ANN and Cox regression predictions. It seems that for simpler models the level of censoring has no effect on the predictions, whereas the predictions of more complex models depend on the level of censoring. The results showed that the NN model provided better predictions for the more complex models, while for the simpler models there was no difference in the results. This finding is consistent with Xiang's study [26]. Therefore, the NN model is proposed in two cases: (1) high censorship (i.e., a censoring rate of 60% or higher) and/or (2) complex models (i.e., with many covariates and interaction terms). This is a useful result for practical problems, which often involve many variables and many censored cases. For these two reasons, the ANN strategy can be used as an alternative to the traditional Cox model. Finally, we note that there are other flexible methods, such as piecewise exponential and grouped-time models, which could be applied to survival data and compared with the ANN model.
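The simulation design summarized above, a log-linear hazard in the covariates with optional interaction terms and censoring rates from 20% to 80%, can be mimicked as follows. This is a hedged Python sketch rather than the authors' R code; the two-covariate setup, exponential event times, and the administrative censoring cutoff are simplifying assumptions chosen only to hit a target censoring rate.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate(n, censor_rate, beta=(0.5, -0.5), interaction=0.0):
    """Exponential survival times from a log-linear hazard in two
    covariates (main effects plus an optional interaction), censored
    administratively at a cutoff chosen to hit the target rate."""
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    hazard = np.exp(beta[0] * x1 + beta[1] * x2 + interaction * x1 * x2)
    t_event = rng.exponential(1.0 / hazard)
    cutoff = np.quantile(t_event, 1.0 - censor_rate)  # administrative censoring
    time = np.minimum(t_event, cutoff)
    event = (t_event <= cutoff).astype(int)           # 1 = event, 0 = censored
    return np.column_stack([x1, x2]), time, event

for rate in (0.2, 0.5, 0.8):
    _, _, event = simulate(1000, rate, interaction=0.3)
    print(f"target {rate:.0%} censored, realized {1 - event.mean():.1%}")
```

Both models can then be fit to the learning portion of each replication and compared on the held-out cases, as in Tables 2 and 3.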

Acknowledgment

The authors wish to express their special thanks to the referees for their valuable comments.

References

[1] E. T. Lee and J. W. Wang, Statistical Methods for Survival Data Analysis, Wiley Series in Probability and Statistics, Wiley-Interscience, Hoboken, NJ, USA, 3rd edition, 2003.

[2] V. Lagani and I. Tsamardinos, "Structure-based variable selection for survival data," Bioinformatics, vol. 26, no. 15, pp. 1887–1894, 2010.

[3] M. H. Kutner, C. J. Nachtsheim, and J. Neter, Applied Linear Regression Models, McGraw-Hill/Irwin, New York, NY, USA, 4th edition, 2004.

[4] W. G. Baxt and J. Skora, "Prospective validation of artificial neural networks trained to identify acute myocardial infarction," Lancet, vol. 347, pp. 12–15, 1996.

[5] B. A. Mobley, E. Schecheer, and W. E. Moore, "Prediction of coronary artery stenosis by artificial networks," Artificial Intelligence in Medicine, vol. 18, pp. 187–203, 2000.

[6] D. West and V. West, "Model selection for medical diagnostic decision support system: breast cancer detection case," Artificial Intelligence in Medicine, vol. 20, pp. 183–204, 2000.

[7] F. Ambrogi, N. Lama, P. Boracchi, and E. Biganzoli, "Selection of artificial neural network models for survival analysis with genetic algorithms," Computational Statistics & Data Analysis, vol. 52, no. 1, pp. 30–42, 2007.

[8] E. Biganzoli, P. Boracchi, L. Mariani, and E. Marubini, "Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach," Statistics in Medicine, vol. 17, pp. 1169–1186, 1998.

[9] E. Biganzoli, P. Boracchi, and E. Marubini, "A general framework for neural network models on censored survival data," Neural Networks, vol. 15, pp. 209–218, 2002.

[10] E. Biganzoli and P. Boracchi, "The Partial Logistic Artificial Neural Network (PLANN): a tool for the flexible modelling of censored survival data," in Proceedings of the European Conference on Emergent Aspects in Clinical Data Analysis (EACDA '05), 2005.

[11] E. Biganzoli, P. Boracchi, F. Ambrogi, and E. Marubini, "Artificial neural network for the joint modelling of discrete cause-specific hazards," Artificial Intelligence in Medicine, vol. 37, pp. 119–130, 2006.

[12] R. Bittern, A. Cuschieri, S. D. Dolgobrodov et al., "An artificial neural network for analysing the survival of patients with colorectal cancer," in Proceedings of the European Symposium on Artificial Neural Networks (ESANN '05), Bruges, Belgium, April 2005.

[13] K. U. Chen and C. J. Christian, "Using back-propagation neural network to forecast the production values of the machinery industry in Taiwan," Journal of American Academy of Business, Cambridge, vol. 9, no. 1, pp. 183–190, 2006.

[14] C. L. Chia, W. Nick Street, and H. W. William, "Application of artificial neural network-based survival analysis on two breast cancer datasets," in Proceedings of the American Medical Informatics Association Annual Symposium (AMIA '07), pp. 130–134, Chicago, Ill, USA, November 2007.

[15] A. Eleuteri, R. Tagliaferri, and L. Milano, "A novel neural network-based survival analysis model," Neural Networks, vol. 16, pp. 855–864, 2003.

[16] B. D. Ripley and R. M. Ripley, "Neural networks as statistical methods in survival analysis," in Clinical Applications of Artificial Neural Networks, pp. 237–255, Cambridge University Press, Cambridge, UK, 2001.

[17] B. Warner and M. Manavendra, "Understanding neural networks as statistical tools," The American Statistician, vol. 50, no. 4, pp. 284–293, 1996.

[18] J. P. Klein and M. L. Moeschberger, Survival Analysis: Techniques for Censored and Truncated Data, Springer, New York, NY, USA, 2nd edition, 2003.

[19] J. W. Kay and D. M. Titterington, Statistics and Neural Networks, Oxford University Press, Oxford, UK, 1999.

[20] E. P. Goss and G. S. Vozikis, "Improving health care organizational management through neural network learning," Health Care Management Science, vol. 5, pp. 221–227, 2002.

[21] R. M. Ripley, A. L. Harris, and L. Tarassenko, "Non-linear survival analysis using neural networks," Statistics in Medicine, vol. 23, pp. 825–842, 2004.

[22] P. J. G. Lisboa, H. Wong, P. Harris, and R. Swindell, "A Bayesian neural network approach for modelling censored data with an application to prognosis after surgery for breast cancer," Artificial Intelligence in Medicine, vol. 28, no. 1, pp. 1–25, 2003.

[23] C. M. Bishop, Neural Networks for Pattern Recognition, The Clarendon Press, Oxford University Press, New York, NY, USA, 1995.

[24] B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, UK, 1996.

[25] N. Fallah, G. U. Hong, K. Mohammad et al., "Nonlinear Poisson regression using neural networks: a simulation study," Neural Computing and Applications, vol. 18, no. 8, pp. 939–943, 2009.

[26] A. Xiang, P. Lapuerta, A. Ryutov et al., "Comparison of the performance of neural network methods and Cox regression for censored survival data," Computational Statistics & Data Analysis, vol. 34, no. 2, pp. 243–257, 2000.

[27] F. E. Harrell, R. M. Califf, D. B. Pryor et al., "Evaluating the yield of medical tests," The Journal of the American Medical Association, vol. 247, pp. 2543–2546, 1982.

[28] F. E. Harrell, K. L. Lee, R. M. Califf et al., "Regression modelling strategies for improved prognostic prediction," Statistics in Medicine, vol. 3, pp. 143–152, 1984.



References

[1] E T Lee and J W Wang Statistical Methods for SurvivalData Analysis Wiley Series in Probability and Statistics Wiley-Interscience Hoboken NJ USA 3rd edition 2003

[2] V Lagani and I Tsamardinos ldquoStructure-based variable selec-tion for survival datardquo Bioinformatics vol 26 no 15 pp 1887ndash1894 2010

[3] M H Kutner C J Nachtsheim and J Neter Applied LinearRegression Models McGraw-HillIrwin New York NY USA4th edition 2004

[4] WG Baxt and J Skora ldquoProspective validation of artificial neu-ral networks trained to identify acute myocardial infarctionrdquoLancet vol 347 pp 12ndash15 1996

Journal of Probability and Statistics 7

[5] B A Mobley E Schecheer and W E Moore ldquoPredictionof coronary artery stenosis by artificial networksrdquo ArtificialIntelligence in Medicine vol 18 pp 187ndash203 2000

[6] D West and V West ldquoModel selection for medical diagnosticdecision support system breast cancer detection caserdquoArtificialIntelligence in Medicine vol 20 pp 183ndash204 2000

[7] F Ambrogi N Lama P Boracchi and E Biganzoli ldquoSelectionof artificial neural network models for survival analysis withgenetic algorithmsrdquo Computational Statistics amp Data Analysisvol 52 no 1 pp 30ndash42 2007

[8] E Biganzoli P Boracchi LMariani andEMarubini ldquoFeed for-ward neural networks for the analysis of censored survival dataa partial logistic regression approachrdquo Statistics inMedicine vol17 pp 1169ndash1186 1998

[9] E Biganzoli P Boracchi and E Marubini ldquoA general frame-work for neural network models on censored survival datardquoNeural Networks vol 15 pp 209ndash218 2002

[10] E M Bignazoli and P Borrachi ldquoThe Partial Logistic ArtificialNeural Network (PLANN) a tool for the flexible modelling ofcensored survival datardquo in Proceedings of the European Confer-ence on EmergentAspects inClinical DataAnalysis (EACDA rsquo05)2005

[11] E M Bignazoli P Borrachi F Amborgini and E MarubinildquoArtificial neural network for the joint modeling of discretecause-specific hazardsrdquo Artificial Intelligence in Medicine vol37 pp 119ndash130 2006

[12] R Bittern A Cuschieri S D Dolgobrodov et al ldquoAn artificialneural network for analysing the survival of patients withcolorectal cancerrdquo in Proceedings of the European Symposium onArtificial Neural Networks (ESANN rsquo05) Bruges Belgium April2005

[13] K U Chen and C J Christian ldquoUsing back-propagation neuralnetwork to forecast the production values of the machineryindustry in Taiwanrdquo Journal of American Academy of BusinessCambridge vol 9 no 1 pp 183ndash190 2006

[14] C L Chia W Nick Street and H W William ldquoApplicationof artificial neural network-based survival analysis on twobreast cancer datasetsrdquo in Proceedings of the American MedicalInformatics Association Annual Symposium (AMIA rsquo07) pp130ndash134 Chicago Ill USA November 2007

[15] A Eleuteri R Tagliaferri and L Milano ldquoA novel neuralnetwork-based survival analysis modelrdquo Neural Networks vol16 pp 855ndash864 2003

[16] B D Ripley and R M Ripley ldquoNeural networks as statisticalmethods in survival analysisrdquo in Clinical Applications of Artifi-cial Neural Networks pp 237ndash255 Cambridge University PressCambridge UK 2001

[17] B Warner and M Manavendra ldquoUnderstanding neural net-works as statistical toolsrdquo Amstat vol 50 no 4 pp 284ndash2931996

[18] J P Klein andM LMoeschberger Survival Analysis Techniquesfor Censored andTruncatedData Springer NewYorkNYUSA2th edition 2003

[19] JW Kay andDM Titterington Statistics andNeural NetworksOxford University Press Oxford UK 1999

[20] E P Goss and G S Vozikis ldquoImproving health care organiza-tional management through neural network learningrdquo HealthCare Management Science vol 5 pp 221ndash227 2002

[21] R M Ripley A L Harris and L Tarassenko ldquoNon-linearsurvival analysis using neural networksrdquo Statistics in Medicinevol 23 pp 825ndash842 2004

[22] P J G Lisboa H Wong P Harris and R Swindell ldquoA Bayesianneural network approach for modeling censored data withan application to prognosis after surgery for breast cancerrdquoArtificial Intelligence in Medicine vol 28 no 1 pp 1ndash25 2003

[23] C M Bishop Neural Networks for Pattern Recognition TheClarendon Press Oxford University Press New York NY USA1995

[24] B D Ripley Pattern Recognition and Neural Networks Cam-bridge University Press Cambridge UK 1996

[25] N Fallah G U Hong KMohammad et al ldquoNonlinear Poissonregression using neural networks a simulation studyrdquo NeuralComputing and Applications vol 18 no 8 pp 939ndash943 2009

[26] A Xiang P Lapuerta A Ryutov et al ldquoComparison of theperformance of neural network methods and Cox regressionfor censored survival datardquo Computational Statistics amp DataAnalysis vol 34 no 2 pp 243ndash257 2000

[27] F E Harrell R M Califf D B Pryor et al ldquoEvaluating theyield of medical testsrdquo The Journal of the American MedicalAssociation vol 247 pp 2543ndash2546 1982

[28] F E Harrell K L Lee R M Califf et al ldquoRegression modelingstrategies for improved prognostic predictionrdquo Statistics inMedicine vol 3 pp 143ndash152 1984

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article Nonlinear Survival Regression Using Artificial Neural …downloads.hindawi.com/journals/jps/2013/753930.pdf · 2019. 7. 31. · regression, is a popular method in

Journal of Probability and Statistics 7

[5] B A Mobley E Schecheer and W E Moore ldquoPredictionof coronary artery stenosis by artificial networksrdquo ArtificialIntelligence in Medicine vol 18 pp 187ndash203 2000

[6] D West and V West ldquoModel selection for medical diagnosticdecision support system breast cancer detection caserdquoArtificialIntelligence in Medicine vol 20 pp 183ndash204 2000

[7] F Ambrogi N Lama P Boracchi and E Biganzoli ldquoSelectionof artificial neural network models for survival analysis withgenetic algorithmsrdquo Computational Statistics amp Data Analysisvol 52 no 1 pp 30ndash42 2007

[8] E Biganzoli P Boracchi LMariani andEMarubini ldquoFeed for-ward neural networks for the analysis of censored survival dataa partial logistic regression approachrdquo Statistics inMedicine vol17 pp 1169ndash1186 1998

[9] E Biganzoli P Boracchi and E Marubini ldquoA general frame-work for neural network models on censored survival datardquoNeural Networks vol 15 pp 209ndash218 2002

[10] E M Bignazoli and P Borrachi ldquoThe Partial Logistic ArtificialNeural Network (PLANN) a tool for the flexible modelling ofcensored survival datardquo in Proceedings of the European Confer-ence on EmergentAspects inClinical DataAnalysis (EACDA rsquo05)2005

[11] E M Bignazoli P Borrachi F Amborgini and E MarubinildquoArtificial neural network for the joint modeling of discretecause-specific hazardsrdquo Artificial Intelligence in Medicine vol37 pp 119ndash130 2006

[12] R Bittern A Cuschieri S D Dolgobrodov et al ldquoAn artificialneural network for analysing the survival of patients withcolorectal cancerrdquo in Proceedings of the European Symposium onArtificial Neural Networks (ESANN rsquo05) Bruges Belgium April2005

[13] K U Chen and C J Christian ldquoUsing back-propagation neuralnetwork to forecast the production values of the machineryindustry in Taiwanrdquo Journal of American Academy of BusinessCambridge vol 9 no 1 pp 183ndash190 2006

[14] C L Chia W Nick Street and H W William ldquoApplicationof artificial neural network-based survival analysis on twobreast cancer datasetsrdquo in Proceedings of the American MedicalInformatics Association Annual Symposium (AMIA rsquo07) pp130ndash134 Chicago Ill USA November 2007

[15] A Eleuteri R Tagliaferri and L Milano ldquoA novel neuralnetwork-based survival analysis modelrdquo Neural Networks vol16 pp 855ndash864 2003

[16] B D Ripley and R M Ripley ldquoNeural networks as statisticalmethods in survival analysisrdquo in Clinical Applications of Artifi-cial Neural Networks pp 237ndash255 Cambridge University PressCambridge UK 2001

[17] B Warner and M Manavendra ldquoUnderstanding neural net-works as statistical toolsrdquo Amstat vol 50 no 4 pp 284ndash2931996

[18] J. P. Klein and M. L. Moeschberger, Survival Analysis: Techniques for Censored and Truncated Data, Springer, New York, NY, USA, 2nd edition, 2003.

[19] J. W. Kay and D. M. Titterington, Statistics and Neural Networks, Oxford University Press, Oxford, UK, 1999.

[20] E. P. Goss and G. S. Vozikis, "Improving health care organizational management through neural network learning," Health Care Management Science, vol. 5, pp. 221–227, 2002.

[21] R. M. Ripley, A. L. Harris, and L. Tarassenko, "Non-linear survival analysis using neural networks," Statistics in Medicine, vol. 23, pp. 825–842, 2004.

[22] P. J. G. Lisboa, H. Wong, P. Harris, and R. Swindell, "A Bayesian neural network approach for modeling censored data with an application to prognosis after surgery for breast cancer," Artificial Intelligence in Medicine, vol. 28, no. 1, pp. 1–25, 2003.

[23] C. M. Bishop, Neural Networks for Pattern Recognition, The Clarendon Press, Oxford University Press, New York, NY, USA, 1995.

[24] B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, UK, 1996.

[25] N. Fallah, G. U. Hong, K. Mohammad, et al., "Nonlinear Poisson regression using neural networks: a simulation study," Neural Computing and Applications, vol. 18, no. 8, pp. 939–943, 2009.

[26] A. Xiang, P. Lapuerta, A. Ryutov, et al., "Comparison of the performance of neural network methods and Cox regression for censored survival data," Computational Statistics & Data Analysis, vol. 34, no. 2, pp. 243–257, 2000.

[27] F. E. Harrell, R. M. Califf, D. B. Pryor, et al., "Evaluating the yield of medical tests," The Journal of the American Medical Association, vol. 247, pp. 2543–2546, 1982.

[28] F. E. Harrell, K. L. Lee, R. M. Califf, et al., "Regression modeling strategies for improved prognostic prediction," Statistics in Medicine, vol. 3, pp. 143–152, 1984.
