
    Journal of Forecasting, J. Forecast. 31, 504–523 (2012). Published online 22 March 2011 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/for.1232

    A Robust Data-Mining Approach to Bankruptcy Prediction

    MEHDI DIVSALAR,1* HABIB ROODSAZ,1 FARSHAD VAHDATINIA,2 GHASSEM NOROUZZADEH3 AND AMIR HOSSEIN BEHROOZ1

    1 Faculty of Management and Accounting, Allameh Tabatabai University, Tehran, Iran
    2 Department of Civil Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
    3 Faculty of Management, University of Tehran, Tehran, Iran

    ABSTRACT

    In this study, new variants of genetic programming (GP), namely gene expression programming (GEP) and multi-expression programming (MEP), are utilized to build models for bankruptcy prediction. Generalized relationships are obtained to classify samples of 136 bankrupt and non-bankrupt Iranian corporations based on their financial ratios. An important contribution of this paper is to identify the effective predictive financial ratios on the basis of an extensive bankruptcy prediction literature review and a sequential feature selection analysis. The predictive performance of the GEP and MEP forecasting methods is compared with the performance of traditional statistical methods and a generalized regression neural network. The proposed GEP and MEP models are effectively capable of classifying bankrupt and non-bankrupt firms and outperform the models developed using other methods. Copyright 2011 John Wiley & Sons, Ltd.

    KEY WORDS bankruptcy prediction; gene expression programming; multi-expression programming; sequential feature selection; financial ratios

    INTRODUCTION

    Univariate and multivariate analyses are the two basic types of study used to predict managerial bankruptcy. Univariate analysis considers the relationship between individual figures or ratios and bankruptcy. Multivariate analysis uses multiple weighted ratios to determine a prediction function for bankruptcy. Fitzpatrick (1931) was the first researcher to use ratio analysis to compare failed and non-failed firms; a univariate analysis of 13 ratios was used to identify business failure. Beaver (1966) carried out the first significant work in the area of bankruptcy prediction using univariate analysis, introducing a univariate technique for classifying firms into two groups based on financial ratios. The ratios were used individually, and a cut-off score was calculated for each ratio by minimizing misclassification. Despite their considerable results, univariate methods were later criticized because the ratios are correlated and because different ratios can give different signals for the same firm (Dimitras et al., 1996). Altman (1968) expanded Beaver's univariate analysis by using multiple discriminant analysis (MDA). Various bankrupt and non-bankrupt groups and a variety of different ratio groups were used by Altman (1968). After about four decades, Altman's Z-score is still widely regarded by researchers as an indicator of a company's financial well-being (Divsalar et al., 2011). In accordance with Altman et al. (1981), the four steps in the development of bankruptcy prediction models are:

    (i) analyzing groups of failed and non-failed firms to identify the most dissimilar financial characteristics between the groups prior to bankruptcy;
    (ii) reclassifying the original sample using financial characteristics;
    (iii) testing the model's predictive ability on a holdout sample;
    (iv) using the model to predict future bankruptcies (Divsalar et al., 2011).

    * Correspondence to: Mehdi Divsalar, Faculty of Management and Accounting, Allameh Tabatabai University, Tehran, Iran. E-mail: [email protected]

    Altman (1993) proposed a revised model incorporating a four-variable Z-score prediction model. Although the majority of international failure prediction studies employ MDA (Charitou et al., 2004; Li and Sun, 2010), questions were raised regarding the restrictive statistical requirements imposed by such methods (Ohlson, 1980). Since then, various methods have been introduced to overcome the shortcomings of MDA and to improve the accuracy of bankruptcy prediction. In general, there are two main groups of techniques for handling this issue (Divsalar et al., 2011). The first group consists of statistical techniques such as Logit (Foreman, 2003; Lin, 2009; Psillaki et al., 2010; Li and Sun, 2010), Probit (Theodossiou, 1991; Fukuda et al., 2009), linear probability (Stone and Rasp, 1991; Vranas, 1992), and cumulative sums (Kahya and Theodossiou, 1999). The second group consists of computational intelligence techniques. Those utilized in this area include genetic algorithms (Shin and Lee, 2002; Wu et al., 2007; Ahn and Kim, 2009), case-based reasoning (Park and Han, 2002), rough sets (Dimitras et al., 1999; McKee and Lensberg, 2002; Sanchis et al., 2007), support vector machines (Min and Lee, 2005), and artificial neural networks (Bentz and Merunka, 2000; Charitou et al., 2004; Ravi and Pramodh, 2008; Chauhan et al., 2009; Lin, 2009; Divsalar et al., 2011). Despite the high accuracy of computational intelligence methods, they suffer from the absence of a bankruptcy theory. Comprehensive surveys of bankruptcy prediction methods can be found in Dimitras et al. (1996), Jones (1987), and Kumar and Ravi (2007).

    Advances in the field of bankruptcy prediction have continued to be made. Genetic programming (GP) (Koza, 1992; Banzhaf et al., 1998) is a developing subarea of evolutionary algorithms. GP is a supervised machine-learning technique that searches a program space instead of a data space (Banzhaf et al., 1998; Gandomi et al., 2011). There have been efforts to apply GP to the bankruptcy prediction problem (e.g. Lensberg et al., 2006; Etemadi et al., 2008). Recently, Divsalar et al. (2011) employed a variant of GP called linear genetic programming (LGP) to classify samples of bankrupt and non-bankrupt Iranian corporations. Gene expression programming (GEP) (Ferreira, 2001) is a recent extension of GP. GEP evolves computer programs of different sizes and shapes encoded in linear chromosomes of fixed length (Gandomi et al., 2011). Multi-expression programming (MEP) (Oltean and Dumitrescu, 2002) is another variant of GP with a linear representation of chromosomes. Numerical experiments indicate that GEP and MEP can be used as efficient alternatives to traditional GP (Oltean and Grossan, 2003a; Alavi et al., 2010).

    The main purpose of this research is to derive new models for classifying bankrupt and non-bankrupt Iranian firms using GEP and MEP. The proposed models are developed from financial data of 65 bankrupt and 71 non-bankrupt firms (e.g., automotive, construction engineering, petrochemical corporations) listed on the Tehran Stock Exchange over the years 1999–2006. The multi-stage strategy used to select effective predictive financial ratios is also described.

    GENETIC PROGRAMMING

    GP is an optimization method that creates computer programs to solve a problem using Darwin's theory of evolution (Koza, 1992; Gandomi et al., 2011; Divsalar et al., 2011). It was introduced by Koza (1992) as an extension of genetic algorithms (GAs). In GP, a random population of computer programs is created to achieve high diversity. Symbolic optimization algorithms like GP represent potential solutions as structured arrangements of symbols. A population member in GP is a hierarchically structured tree made up of functions and terminals, which are selected from a function set and a terminal set. The function set F can contain basic arithmetic operations, Boolean logic functions, or any other mathematical functions. The terminal set T comprises the arguments for the functions and can consist of numerical constants, logical constants, or variables. The functions and terminals are chosen at random and combined to form a tree-like structure with a root point, with branches extending from each function and ending in a terminal (Gandomi et al., 2011; Divsalar et al., 2011).

    Once a population of models has been created at random, GP evaluates the individuals, selects individuals for reproduction, and creates new individuals by mutation, crossover, and direct reproduction (Koza, 1992). During crossover, a point on a branch of each program is randomly selected. As shown in Figure 1, the sets of terminals and/or functions below these points are then swapped to create two new programs. The process continues by evaluating the fitness of the new population and starting a new round of reproduction and crossover. The GP algorithm occasionally selects a function or terminal from a model at random and applies the mutation operator to it (see Figure 2) (Alavi et al., 2011). GEP and MEP are linear variants of GP, in which individuals are represented as linear strings (Oltean and Grossan, 2003a; Gandomi et al., 2011).
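    The crossover and mutation operators described above can be sketched on small expression trees. This is a minimal illustration rather than the implementation used in the study: the nested-list tree encoding, the function set in `ARITY`, the terminal set and all helper names are assumptions chosen for the example.

```python
import copy
import random

# A program is a nested list [function, child, ...] or a terminal string.
ARITY = {"+": 2, "-": 2, "*": 2, "sq": 1}   # illustrative function set (assumption)
TERMINALS = ["x0", "x1", "x2"]               # illustrative terminal set (assumption)


def paths(tree, prefix=()):
    """Return the index path of every node (the root is the empty path)."""
    found = [prefix]
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            found.extend(paths(child, prefix + (i,)))
    return found


def get_node(tree, path):
    """Follow an index path down from the root."""
    for i in path:
        tree = tree[i]
    return tree


def set_node(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return copy.deepcopy(new)
    tree = copy.deepcopy(tree)
    parent = get_node(tree, path[:-1])
    parent[path[-1]] = copy.deepcopy(new)
    return tree


def crossover(parent1, parent2, rng):
    """Swap one randomly chosen subtree between the two parents."""
    p1 = rng.choice(paths(parent1))
    p2 = rng.choice(paths(parent2))
    child1 = set_node(parent1, p1, get_node(parent2, p2))
    child2 = set_node(parent2, p2, get_node(parent1, p1))
    return child1, child2


def mutate(tree, rng):
    """Point mutation: replace one node by a symbol of the same arity."""
    path = rng.choice(paths(tree))
    node = get_node(tree, path)
    if isinstance(node, list):
        same_arity = [f for f, a in ARITY.items() if a == ARITY[node[0]]]
        new = [rng.choice(same_arity)] + node[1:]
    else:
        new = rng.choice(TERMINALS)
    return set_node(tree, path, new)


def size(tree):
    """Number of nodes in the tree."""
    return len(paths(tree))
```

    Because crossover only exchanges subtrees, the two children together always contain exactly as many nodes as the two parents, and point mutation leaves the tree shape unchanged.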

    Figure 1. Typical crossover operation in genetic programming (panels: Parent 1, Parent 2, Child 1, Child 2)

    Figure 2. Typical mutation operation in genetic programming

    Gene expression programming

    GEP was first introduced by Ferreira (2001). Most of the genetic operators used in GAs can also be used in GEP with minor changes. GEP consists of a function set, a terminal set, a fitness function, control parameters, and a termination condition. GEP uses character strings of fixed length to represent programs (Gandomi et al., 2011). The solutions are afterwards expressed as parse trees of different sizes and shapes. These trees are called GEP expression trees (ETs). In GEP, the creation of genetic diversity is greatly simplified because genetic operators work at the chromosome level. The multigenic nature of GEP allows the evolution of more complex programs composed of several subprograms. Each GEP gene consists of a list of symbols with a fixed length (Gandomi et al., 2011). A typical GEP gene with the given function and terminal sets is as follows:

    log .  . + . x0 .  .  . + .  . x1 . x0 . x2 . 4 . x1 . x3    (1)

    where x0, ..., x3 are variables and 4 is a constant; '.' is an element separator for easy reading. The above expression is termed a Karva notation or K-expression (Ferreira 2001, 2006; Zhou et al., 2002). A K-expression can be represented by a diagram, which is an ET. For example, the above sample gene can be expressed as in Figure 3.

    The conversion starts from the first position in the K-expression, which corresponds to the root of the ET, and reads through the string one symbol at a time (Gandomi et al., 2011). The above GEP gene can be expressed in mathematical form as

    log (x0  ((x2 + 4)  (x1  x3))) + (x1  x0)    (2)

    Conversely, an ET can be transformed into a K-expression by recording the nodes from left to right in each layer of the ET, from the root layer down to the deepest one, to form the string. GEP genes have fixed length. Thus what varies in GEP is not the length of genes but the size of the corresponding ETs. To

    Figure 3. Example of expression trees (ETs)
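    The K-expression decoding described above (first symbol as root, remaining symbols filling the tree layer by layer from left to right) can be sketched as follows. This is an illustrative sketch only: the sample gene `GENE`, the `ARITY` table and the function names are assumptions made for the example and are not the gene shown in expression (1).

```python
import math

# Number of arguments of each function symbol (illustrative set, an assumption).
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "log": 1}


def decode_k_expression(symbols):
    """Build an expression tree from a K-expression, layer by layer.

    Each node is a list [symbol, child, ...]. Symbols are consumed in order:
    the first is the root, and every function takes its arguments from the
    next unused positions. Symbols left over form the non-coding tail.
    """
    nodes = [[s] for s in symbols]
    next_free = 1          # index of the next symbol not yet attached
    i = 0
    while i < next_free:
        n_args = ARITY.get(nodes[i][0], 0)
        nodes[i].extend(nodes[next_free:next_free + n_args])
        next_free += n_args
        i += 1
    return nodes[0]


def evaluate(tree, env):
    """Evaluate a decoded tree for the variable values in env."""
    symbol, *children = tree
    args = [evaluate(child, env) for child in children]
    if symbol == "+":
        return args[0] + args[1]
    if symbol == "-":
        return args[0] - args[1]
    if symbol == "*":
        return args[0] * args[1]
    if symbol == "/":
        return args[0] / args[1]
    if symbol == "log":
        return math.log(args[0])
    if symbol in env:
        return env[symbol]
    return float(symbol)   # numerical constant


# Hypothetical gene: decodes to (x0 * x1) + log(x2 - 4); the last two
# symbols are never reached and belong to the non-coding tail.
GENE = ["+", "*", "log", "x0", "x1", "-", "x2", "4", "x1", "x3"]
TREE = decode_k_expression(GENE)
```

    The gene has a fixed length of ten symbols, but only the first eight are expressed. A mutation in the head can lengthen or shorten the expressed part, which is why the size of the ET varies while the gene length does not.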


    set of arithmetic operators F = { , +, /} and the set of terminals T = {x0, x1, x2}, a typical MEP chromosome is as follows:

        0: x0
        1: x1
        2:   0, 1
        3: x2
        4: / 2, 3
        5: + 4, 3

    Translation of MEP individuals into programs is obtained by reading the chromosome top-down, starting with the first position. A terminal symbol represents a simple expression, and each function symbol specifies a complex expression obtained by connecting the operands specified by the argument positions with the current function symbol (Oltean and Grossan, 2003b; Alavi et al., 2010). In the given example, genes 0, 1 and 3 encode simple expressions formed by a single terminal symbol. These expressions are E0 = x0, E1 = x1 and E3 = x2. Gene 2 indicates the operation   on the operands located at positions 0 and 1 of the chromosome. Therefore gene 2 encodes the expression E2 = x0   x1. Gene 4 indicates the operation / on the operands located at positions 2 and 3. Therefore gene 4 encodes the expression E4 = (x0   x1)/x2. Gene 5 indicates the operation + on the operands located at positions 4 and 3. Therefore gene 5 encodes the expression E5 = ((x0   x1)/x2) + x2. To choose one of these evolved expressions (E0 to E5) as the chromosome representative, the fitness of each expression encoded in an MEP chromosome is calculated (Alavi et al., 2010). To solve a symbolic regression problem, the fitness of an MEP chromosome may be computed using the following equation (Oltean and Grossan, 2003a):

    $$ f = \min_{i=1,\dots,m} \left\{ \sum_{j=1}^{n} \left| E_j - O_j^i \right| \right\} \qquad (4) $$

    in which n is the number of fitness cases, $E_j$ is the expected value for fitness case j, $O_j^i$ is the value returned for the j-th fitness case by the i-th expression encoded in the current chromosome, and m is the number of chromosome genes (Alavi et al., 2010).
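    The top-down translation of an MEP chromosome and the fitness of equation (4) can be sketched in a few lines. This is a hedged illustration, not the modified C++ code used in the study: the tuple encoding and function names are assumptions, and because the operator of gene 2 is not legible in the text above, multiplication is assumed purely for the sake of the example.

```python
import operator

OPS = {"*": operator.mul, "+": operator.add, "/": operator.truediv}

# Example chromosome from the text. The operator of gene 2 is an assumption
# ("*"); terminals are 1-tuples, functions are (operator, position, position).
CHROMOSOME = [
    ("x0",),
    ("x1",),
    ("*", 0, 1),
    ("x2",),
    ("/", 2, 3),
    ("+", 4, 3),
]


def gene_values(chromosome, inputs):
    """Evaluate every gene top-down; a gene may only refer to earlier genes."""
    values = []
    for gene in chromosome:
        if len(gene) == 1:
            values.append(inputs[gene[0]])
        else:
            op, a, b = gene
            values.append(OPS[op](values[a], values[b]))
    return values


def mep_fitness(chromosome, cases):
    """Equation (4): smallest summed absolute error over all encoded expressions.

    cases is a list of (inputs, expected) pairs.
    Returns (best_error, index_of_best_gene).
    """
    errors = [0.0] * len(chromosome)
    for inputs, expected in cases:
        for i, output in enumerate(gene_values(chromosome, inputs)):
            errors[i] += abs(expected - output)
    best = min(range(len(errors)), key=errors.__getitem__)
    return errors[best], best
```

    Every gene yields a candidate expression, so one chromosome is evaluated against m expressions at once and the best of them represents the chromosome. In this sketch a zero-valued x2 would raise a division error; a protected division is a common remedy but is not shown here.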

    DEVELOPMENT OF MATHEMATICAL MODELS FOR BANKRUPTCY PREDICTION

    Bankruptcy is a condition in which a business cannot meet its debt obligations and petitions a federal district court for either reorganization of its debts or liquidation of its assets. In such an action, the property of a debtor is taken over by a receiver or trustee in bankruptcy for the benefit of the creditors. Signs of potential corporate failure are evident long before the actual bankruptcy materializes (Divsalar et al., 2011). Accurate prediction of the declining business activity that leads to bankruptcy allows time for managers and creditors to take corrective action. The financial state of an enterprise is assessed according to various factors, and financial ratios are usually calculated for this purpose. Because ratios do not depend on the size of an enterprise, they allow enterprises of different sizes to be compared. To identify the possibility of bankruptcy, a linear score function is often used, as follows (Divsalar et al., 2011):

    $$ Z = k_1 R_1 + k_2 R_2 + \dots + k_m R_m \qquad (5) $$


    in which R1, R2, ..., Rm are financial ratios. The Z value describes the possibility of bankruptcy in an enterprise and can be used to define low, medium, and high possibilities of bankruptcy. Every method has different financial ratios (Ri) and different coefficients (ki) (Divsalar et al., 2011).

    In this paper, the GEP and MEP techniques were used to predict the survival or failure of Iranian corporations. Various effective predictive financial ratios were used as input parameters. The predicted class is determined by comparing the model output with a threshold value of 0.5 (rounding threshold). When the output of a GEP or MEP model is greater than or equal to 0.5, the firm is classified as bankrupt. When the output is less than 0.5, the firm is classified as non-bankrupt. The bankruptcy class (BC) formulation was considered to be as follows:

    $$ BC_{\mathrm{GEP,MEP}} = f(R_1, R_2, \dots, R_m) \qquad (6) $$

    $$ BC = \begin{cases} 1, & \text{if } BC_{\mathrm{GEP,MEP}} \ge 0.5 \\ 0, & \text{if } BC_{\mathrm{GEP,MEP}} < 0.5 \end{cases} \qquad (7) $$

    where R1, R2, ..., Rm are predictive financial ratios, and 0 and 1 are codes representing the non-bankrupt and bankrupt firms, respectively.
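    The linear score of equation (5) and the rounding rule of equation (7) can be written as a short sketch. The function names and the coefficients inside `toy_model` are hypothetical and serve only to show the mechanics; they are not values or models from this study.

```python
def linear_score(ratios, coefficients):
    """Equation (5): Z = k1*R1 + k2*R2 + ... + km*Rm."""
    if len(ratios) != len(coefficients):
        raise ValueError("need exactly one coefficient per ratio")
    return sum(k * r for k, r in zip(coefficients, ratios))


def bankruptcy_class(model_output, threshold=0.5):
    """Equation (7): 1 (bankrupt) if the output is >= threshold, else 0."""
    return 1 if model_output >= threshold else 0


def toy_model(r1, r2, r3, r4):
    """Hypothetical stand-in for an evolved expression f(R1, ..., R4)."""
    return 0.2 * r1 - 0.5 * r2 + 0.1 * r3 + 0.6 * r4


# Usage: classify one firm from its four ratios (made-up values).
predicted = bankruptcy_class(toy_model(1.0, 0.1, 0.5, 0.9))
```

    Note that an output of exactly 0.5 is assigned to the bankrupt class, in line with the greater-than-or-equal condition stated in the text.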

    Performance measures

    For a more detailed performance analysis, the sensitivity, specificity, positive predictivity and accuracy values of the proposed models were obtained using the following equations (Divsalar et al., 2011):

    $$ \text{Sensitivity (\%)} = \frac{TP}{TP + FN} \times 100 \qquad (8) $$

    $$ \text{Specificity (\%)} = \frac{TN}{TN + FP} \times 100 \qquad (9) $$

    $$ \text{Positive predictivity (\%)} = \frac{TP}{TP + FP} \times 100 \qquad (10) $$

    $$ \text{Accuracy (\%)} = \frac{TP + TN}{TP + FP + FN + TN} \times 100 \qquad (11) $$

    where

    TP (true positive): the model predicts that the class is 1 and the class of the given instance is indeed 1;
    TN (true negative): the model predicts that the class is 0 and the class of the given instance is indeed 0;
    FP (false positive): the model predicts that the class is 1 but the class of the given instance is 0;
    FN (false negative): the model predicts that the class is 0 but the class of the given instance is 1.

    Receiver operating characteristic (ROC) curves were also used to visualize the detection performance of the classifiers on the entire database. The selected index of performance was the area (A) under the ROC curve. Generally, a higher area index reflects better diagnostic performance (Divsalar et al., 2011).
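    Equations (8) to (11) can be checked numerically with a small sketch. The counts used below are the training-data counts reported for GEP Model I in Table III (TP = 41, FN = 4, FP = 3, TN = 54). The function names are assumptions, and the pairwise-comparison estimate of the area under the ROC curve is a standard construction that is not necessarily how the authors computed it.

```python
def classification_measures(tp, fn, fp, tn):
    """Sensitivity, specificity, positive predictivity and accuracy, in percent."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "positive_predictivity": 100.0 * tp / (tp + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + fp + fn + tn),
    }


def roc_auc(scores, labels):
    """Area under the ROC curve as the share of (positive, negative) pairs
    in which the positive instance receives the higher score (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Training counts of GEP Model I as reported in Table III.
measures = classification_measures(tp=41, fn=4, fp=3, tn=54)
rounded = {name: round(value, 2) for name, value in measures.items()}
# rounded == {"sensitivity": 91.11, "specificity": 94.74,
#             "positive_predictivity": 93.18, "accuracy": 93.14}
```

    The rounded values agree with the percentages listed in Table III for the training data, which is a useful consistency check on the reconstructed formulas.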


    Experimental design

    The data used for model development consisted of financial data from 65 bankrupt and 71 non-bankrupt Iranian companies between 1999 and 2006. The datasets were obtained from the Tehran Stock Exchange. This database was previously employed by Divsalar et al. (2011) to build bankruptcy prediction models based on linear genetic programming and radial basis function neural network methodologies. Firms from several industries trade shares on the Tehran Stock Exchange, including automotive, construction engineering, telecommunications, agriculture, petrochemical, mining, banking and insurance, and others. Under paragraph 141 of Iran Trade Law, a firm is bankrupt when its total value of retained earnings is equal to or greater than 50% of its listed capital. The size of the firms was considered as a potential explanatory variable in the variable selection phase.

    According to Lensberg et al. (2006), a common approach to predicting bankruptcy is to survey the literature to identify a large set of potential predictive financial and/or non-financial variables. The next step is to develop, through some combination of judgmental and mathematical analysis, a reduced set of variables that will predict bankruptcy. In this study, a two-step procedure was used to select input variables (Divsalar et al., 2011). In the first stage, 25 variables among more than 45 financial ratios were selected based on their popularity in the literature. The selected variables are listed in Table I. In the second stage, variables with less significant discrimination ability were removed by means of a sequential feature selection (SFS) analysis. The SFS technique is a common method of feature selection for reducing the dimensionality of data. It has two components: (1) an objective function, called the criterion, which is the mean squared error for regression models and the misclassification rate for classification models; and (2) a sequential search algorithm which adds or removes features from a candidate subset while evaluating the criterion. The SFS method has two variants (MathWorks, 2007):

    (i) sequential forward selection, in which predictor variables are sequentially added to an empty candidate set until the addition of further variables does not decrease the criterion;
    (ii) sequential backward selection, in which predictor variables are sequentially removed from a full candidate set until the removal of further variables increases the criterion.
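    The two SFS variants listed above can be sketched as greedy searches over a user-supplied criterion. This is an illustrative pure-Python sketch, not the MATLAB routine used in the study; the function names and the toy criterion (a made-up error that falls when the hypothetical informative ratios R1 and R2 are present and rises slightly with subset size) are assumptions.

```python
def sequential_forward_selection(features, criterion):
    """Add one feature at a time while the criterion keeps decreasing."""
    selected, remaining = [], list(features)
    best = float("inf")
    while remaining:
        score, choice = min((criterion(selected + [f]), f) for f in remaining)
        if score >= best:          # no further decrease: stop
            break
        selected.append(choice)
        remaining.remove(choice)
        best = score
    return selected


def sequential_backward_selection(features, criterion):
    """Remove one feature at a time until any removal increases the criterion."""
    selected = list(features)
    best = criterion(selected)
    while len(selected) > 1:
        score, drop = min(
            (criterion([f for f in selected if f != d]), d) for d in selected
        )
        if score > best:           # every removal makes things worse: stop
            break
        selected.remove(drop)
        best = score
    return selected


def toy_criterion(subset):
    """Made-up misclassification rate used only to exercise the search."""
    informative = {"R1", "R2"}
    return 0.5 - 0.2 * len(informative & set(subset)) + 0.01 * len(subset)


CANDIDATES = ["R1", "R2", "R3", "R4"]
forward = sequential_forward_selection(CANDIDATES, toy_criterion)
backward = sequential_backward_selection(CANDIDATES, toy_criterion)
```

    In practice the criterion would be a cross-validated misclassification rate of a classifier trained on the candidate subset. Both searches are greedy, so they can settle on different subsets when features interact.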

    Table I. Variables description

    Variable   Description (ratios)
    R1         Sales to Current assets
    R2         Operational income to Sales
    R3         Quick assets to Total assets
    R4         Total liability to Total assets
    R5         Current liabilities to Total liabilities
    R6         Sales to Fixed assets
    R7         Gross profit to Sales
    R8         Earnings before interest and taxes to Interest expenses
    R9         Net income to Total assets
    R10        Net income to Shareholders' equity
    R11        Sales to Working capital
    R12        Cash to Current liabilities
    R13        Sales to Shareholders' equity
    R14        Earnings before interest and taxes interest to Sales
    R15        Operational income to Total assets
    R16        Net income to Sales
    R17        Retained earnings to Total assets
    R18        Cash to Total assets
    R19        Quick assets to Current liabilities
    R20        Receivables to Sales
    R21        Market value of equity to Total liabilities
    R22        Receivables to Inventory
    R23        Long-term debt to Shareholders' equity
    R24        Cash to Total assets
    R25        Retained earnings to Stock capital

    Note: R1, R2, R3 and R4 are the final variables selected by the SFS analysis.


    In the present study, sequential backward selection was used to select the final variables. The Statistics Toolbox in MATLAB was used to perform the SFS analysis. The SFS procedure selected the four variables from the 25 candidates that could best discriminate the bankrupt firms from the non-bankrupt firms (Divsalar et al., 2011). The selected financial ratios are:

    R1: sales to current assets ratio;
    R2: operational income to sales ratio;
    R3: quick assets to total assets ratio;
    R4: total liability to total assets ratio.

    The same results were reported by Divsalar et al. (2011). The box plots and feature space plots of the four selected financial ratios are presented in Figures 5 and 6, respectively (Divsalar et al., 2011). As can be seen, the patterns belonging to each class lie close to each other and are relatively well separated from those of the other class within the feature space. Therefore, the reduced financial ratio set not only improves the classification procedure in the next step but also provides an appropriate tool for better discrimination of the two classes (Divsalar et al., 2011). According to Dimitras et al. (1996), these financial ratios are popular predictive variables in the bankruptcy prediction literature.

    Figure 5. Box plots of selected financial ratios (0 = bankrupt; 1 = non-bankrupt). Panels: (a) R1, (b) R2, (c) R3, (d) R4; horizontal axis: Column Number; vertical axis: Values

    Figure 6. Feature space plots of selected financial ratios

    Overfitting is one of the essential problems in machine-learning generalization. An efficient way to prevent overfitting is to test the derived models on a validation set (Banzhaf et al., 1998). This approach was used in this study to improve the generalization of the models. To this end, the available datasets were randomly divided into learning, validation and testing subsets (Alavi et al., 2011). The learning data were used for training (genetic evolution). The validation data were used to assess the generalization capability of the models on data they were not trained on (model selection). Thus both the learning and validation data were involved in the modeling process and were grouped together as the training data. The models with the best performance on both the learning and validation datasets were selected as the outcomes of the runs. The testing data were then used to measure the performance of the GEP and MEP models on data that played no role in building them. To obtain a consistent data division, several combinations of the training and testing sets were considered. The selection was such that the maximum, minimum, mean and standard deviation of the parameters were consistent in the training and testing datasets (Alavi et al., 2011). Of the 136 datasets, 102 data vectors were used for the training process (82 for learning and 20 for validation). The remaining 34 were used for testing the derived models.
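    The 82/20/34 division described above can be sketched as a seeded random split, together with the summary statistics that would be compared across subsets. This is a minimal sketch under stated assumptions: the function names, the fixed seed and the use of integer identifiers in place of real firm records are illustrative choices, not details from the study.

```python
import random
import statistics


def split_dataset(samples, n_learning=82, n_validation=20, n_testing=34, seed=0):
    """Randomly divide the samples into learning, validation and testing subsets."""
    if n_learning + n_validation + n_testing != len(samples):
        raise ValueError("subset sizes must add up to the number of samples")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    learning = shuffled[:n_learning]
    validation = shuffled[n_learning:n_learning + n_validation]
    testing = shuffled[n_learning + n_validation:]
    return learning, validation, testing


def summary(values):
    """Maximum, minimum, mean and standard deviation of one variable."""
    return {
        "max": max(values),
        "min": min(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
    }


# Usage with placeholder firm identifiers 0..135 instead of real records.
learning, validation, testing = split_dataset(range(136))
```

    To mimic the consistency check in the text, one could repeat the split with several seeds and keep the division whose `summary` values for each ratio are closest between the training (learning plus validation) and testing subsets.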

    Model construction and analysis using GEP

    The GEP models were developed to obtain explicit relationships for detecting the classes of bankrupt and non-bankrupt firms (BC) as follows:

    $$ BC_{\mathrm{GEP}} = f(R_1, R_2, R_3, R_4) \qquad (12) $$

    where R1, R2, R3 and R4 are the final selected predictor variables. Two GEP models (Models I and II) were developed separately using different function sets. The first function set (Model I) included nearly all available functions. The second (Model II) included only addition, subtraction, division, and multiplication in order to obtain a short and simple equation. The parameters involved in the GEP predictive algorithm are shown in Table II. The parameter settings affect the generalization capability of the GEP models. They were selected on the basis of previously suggested values (Ferreira, 2006) and a trial-and-error approach. The GEP algorithm was implemented using GeneXproTools (Ferreira, 2006; GEPSOFT, 2006; Gandomi et al., 2011). The best GEP models were chosen on the basis of a multi-objective strategy as follows (Alavi et al., 2011):

    (i) involving all input variables, although this was not a predominant factor;
    (ii) providing the best fitness value on the learning set of data;
    (iii) providing the best fitness value on a validation set of unseen data.

    To evaluate the importance of the input parameters, their frequency values were obtained. A frequency value of 1.00 indicates that the input variable appeared in 100% of the best 30 programs evolved by GEP.

    Table II. Parameter settings for the GEP algorithm

    Parameter                       BCGEP, Model I                              BCGEP, Model II
    Function set                    +, −, ×, /, √, log, sin, cos, tan, exp      +, −, ×, /
    Number of generation            2000–5000                                   2000–5000
    Number of chromosomes           100                                         100
    Number of genes                 3                                           3
    Head size                       8                                           8
    Linking function                , +                                         , +
    Fitness function error type     MAE                                         MAE
    Mutation rate                   0.044                                       0.044
    Inversion rate                  0.1                                         0.1
    One-point recombination rate    0.3                                         0.3
    Two-point recombination rate    0.3                                         0.3
    Gene recombination rate         0.1                                         0.1
    Gene transposition rate         0.1                                         0.1


    GEP-based bankruptcy prediction model

    The GEP-based empirical relationships for the prediction of BC are as given below:

    BCGEP, Model I D 19.428R2

    R4R1Ccos .sin .0.173=R1/ C R3/Ccos .sin .R2 2R1/ C R2 C 4.466R3/

    (13)

    BCGEP, Model II D 0.295 0.588R2 0.768R3 CR22 CR2R1 .R1C1.82/.R2 R1 R3/ CR3CR4R3R4

    (14)

    where R1, R2, R3 and R4 are the final predictor variables. The classification results obtained by the GEP models are shown in Tables III and IV. The frequency values of the predictor variables of the models are presented in Figure 7. According to this figure, the bankruptcy classification is more sensitive to R3 and R4 than to the other inputs.

    Model construction and analysis using MEP

    The MEP-based models were developed using the available database. Two MEP models (ModelsIII and IV) were separately developed using two different function sets. Table V presents various

    parameters involved in the MEP predictive algorithm. The parameter selection will affect the model

    generalization capability of MEP (Alavi et al., 2010). They were selected after a trial-and-error

    approach. For the analysis, the source code of MEP (Oltean, 2004) in C++ was modified to be

    Table III. Classification results obtained by GEP (Model I)

    Predicted class by GEP (Model I)

    Samples Training data Testing data

    Bankrupt Non-bankrupt Bankrupt Non-bankrupt

    Actual class Bankrupt 41 4 19 1Non-bankrupt 3 54 2 12

    Sensitivity (%) 91.11 95.00Specificity (%) 94.74 85.71Positive predictivity (%) 93.18 90.48Accuracy (%) 93.14 91.18

    Table IV. Classification results obtained by GEP (Model II)

    Predicted class by GEP (Model II)

    Samples Training data Testing Data

    Bankrupt Non-bankrupt Bankrupt Non-bankrupt

    Actual class Bankrupt 37 8 16 4Non-bankrupt 6 51 3 11

    Sensitivity (%) 82.22 80.00Specificity (%) 89.47 78.57Positive predictivity (%) 86.05 84.21Accuracy (%) 86.27 79.41

    Copyright 2011 John Wiley & Sons, Ltd. J. Forecast. 31, 504523 (2012)

    DOI: 10.1002/for

  • 7/30/2019 A Robast DM Approach to Bankruptcy Prediction

    13/20

    516 M. Divsalar et al.

    Freq

    uency

    Model I

    Model II

    1.0

    0.0

    0.8

    0.6

    0.4

    0.2

    R1 R4R2 R3

    Figure 7. Contributions of predictor variables in GEP models

    Table V. Parameter settings for the MEP algorithm

    Parameter Settings

    BCMEP, Model III BCMEP, Model IV

    Function set C, , , =, exp, sin, cos C, , , =Population size 5001500 5001500Chromosome length 30 genes 30 genesNumber of generations 250 250Crossover probability 0.5, 0.9 0.5, 0.9Crossover type Uniform UniformMutation probability 0.01 0.01Terminal set Problem inputs Problem inputs

    applicable to the present problem. The frequency values of the predictor variables were obtained by a procedure similar to that used for GEP. The best MEP models were chosen following the same multi-objective strategy used for deriving the GEP models.
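    To make the settings in Table V more concrete, the following is a minimal sketch of how a multi-expression chromosome can be evaluated. It follows the general MEP representation (a linear string of genes in which each function gene points to lower-index genes, so that one chromosome encodes several expressions) and is not the modified C++ code used in this study. The gene layout, the protected division, the 0.5 decision threshold and the toy data are assumptions made for illustration only.

```python
import math

# Function set of Model III; the protected division is an assumption of this sketch.
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if abs(b) > 1e-12 else 1.0,
}
UNARY = {"sin": math.sin, "cos": math.cos, "exp": math.exp}


def evaluate_chromosome(chromosome, inputs):
    """Return the value of every gene, i.e. of every expression in the chromosome."""
    values = []
    for gene in chromosome:
        if isinstance(gene, int):        # terminal: index of a problem input
            values.append(inputs[gene])
        elif gene[0] in UNARY:           # (name, argument_gene)
            values.append(UNARY[gene[0]](values[gene[1]]))
        else:                            # (name, left_gene, right_gene)
            values.append(BINARY[gene[0]](values[gene[1]], values[gene[2]]))
    return values


def best_gene(chromosome, samples, labels, threshold=0.5):
    """Pick the gene whose thresholded output misclassifies the fewest samples."""
    errors = [0] * len(chromosome)
    for x, y in zip(samples, labels):
        for i, v in enumerate(evaluate_chromosome(chromosome, x)):
            errors[i] += int((v >= threshold) != bool(y))
    best = min(range(len(chromosome)), key=errors.__getitem__)
    return best, errors[best]


# Hypothetical 5-gene chromosome over two inputs (the study used 30 genes and four ratios).
chromosome = [0, 1, ("+", 0, 1), ("*", 2, 2), ("cos", 3)]
print(evaluate_chromosome(chromosome, [0.2, 0.3]))
print(best_gene([0, 1, ("+", 0, 1)], [[0.1, 0.1], [0.4, 0.4]], [0, 1]))
```

    Because every gene is itself a candidate model, the fitness of a chromosome is taken from its best gene, which is the property that distinguishes MEP from single-expression encodings.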

    MEP-based bankruptcy prediction model

    The MEP-based empirical relationships for classifying bankrupt and non-bankrupt firms in terms of R1, R2, R3, and R4 are given below:

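    BC_MEP, Model III = [expression not recoverable from the extracted text; surviving fragments, in order: cos, cos, R4, R1(2R2 + R3 … 1), R1, R2, 2]    (15)

    BC_MEP, Model IV = [expression not recoverable from the extracted text; surviving fragments, in order: 2R4, (1 … 2R1R2) 2, R3, R2, + 1.5]    (16)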

    The classification results obtained by the MEP models are shown in Tables VI and VII. The frequency values of the predictor variables are presented in Figure 8. As can be seen, the bankruptcy classification is more sensitive to R3 and R4 than to the other inputs.


    Table VI. Classification results obtained by MEP (Model III)

                                 Predicted class by MEP (Model III)
                                 Training data               Testing data
    Actual class                 Bankrupt    Non-bankrupt    Bankrupt    Non-bankrupt
    Bankrupt                     39          6               18          2
    Non-bankrupt                 4           53              2           12

    Sensitivity (%)              86.67                       90.00
    Specificity (%)              92.98                       85.71
    Positive predictivity (%)    90.70                       90.00
    Accuracy (%)                 90.20                       88.24

    Table VII. Classification results obtained by MEP (Model IV)

                                 Predicted class by MEP (Model IV)
                                 Training data               Testing data
    Actual class                 Bankrupt    Non-bankrupt    Bankrupt    Non-bankrupt
    Bankrupt                     38          7               17          3
    Non-bankrupt                 5           52              3           11

    Sensitivity (%)              84.44                       85.00
    Specificity (%)              91.23                       78.57
    Positive predictivity (%)    88.37                       85.00
    Accuracy (%)                 88.24                       82.35
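    The four performance measures reported in the classification tables follow directly from the confusion-matrix counts, with bankrupt firms taken as the positive class. A minimal sketch is given below; the function and variable names are illustrative and do not come from the original study.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return the four percentages reported in the classification tables.

    tp: bankrupt firms predicted bankrupt      fn: bankrupt firms predicted non-bankrupt
    fp: non-bankrupt firms predicted bankrupt  tn: non-bankrupt firms predicted non-bankrupt
    """
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "positive_predictivity": 100.0 * tp / (tp + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + fn + fp + tn),
    }


# Training-data counts of MEP Model IV (Table VII).
metrics = classification_metrics(tp=38, fn=7, fp=5, tn=52)
print({name: round(value, 2) for name, value in metrics.items()})
```

    With these counts the sketch reproduces the training column of Table VII (84.44, 91.23, 88.37 and 88.24%).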

    Figure 8. Contributions of predictor variables in MEP models (bar chart of frequency, 0.0 to 1.0, for R1, R2, R3 and R4; series: Model III, Model IV)

    COMPARISON OF THE PROPOSED BANKRUPTCY PREDICTION MODELS

    As shown in Tables III, IV, VI, and VII, Model I created by GEP provides the best performance on the training and testing data, followed by Models III and IV of MEP and then Model II of GEP. The performance of the GEP and MEP models on the training and testing data implies that they have both good predictive ability and good generalization performance. The performance of the GEP and MEP classifiers was further compared with that of a generalized regression neural network (GRNN) (Specht, 1991) classifier. GRNN is a variant of the radial basis function (RBF) network. Unlike in the standard RBF network, the weights of a GRNN can be calculated analytically. GRNN is a memory-based feedforward network based on the estimation of probability density functions (Specht, 1991). The


    developed GRNN model had three layers: an input layer with four units (R1, R2, R3, R4), a hidden layer with 102 neurons (equal to the number of training samples), and an output layer. The MATLAB Neural Network toolbox was employed to create the GRNN model.
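    The memory-based character of the GRNN described above can be illustrated with a short sketch: each training sample acts as one hidden (pattern) unit, and the prediction is a Gaussian-kernel-weighted average of the stored targets. This is a plain-Python illustration of the general GRNN estimate, not the MATLAB model built in this study; the smoothing parameter `sigma` and the toy data are assumptions, since the spread actually used is not reported here.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=0.1):
    """Kernel-weighted average of the training targets (one pattern unit per sample)."""
    weights = []
    for xi in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))       # squared Euclidean distance
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))  # Gaussian pattern unit
    total = sum(weights)
    if total == 0.0:
        # Query far from every stored sample: all weights underflow, fall back to the mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical two-sample training set with two inputs (the study used 102 samples, four ratios).
train_x = [[0.0, 0.0], [1.0, 1.0]]
train_y = [0.0, 1.0]
print(grnn_predict([0.5, 0.5], train_x, train_y))  # midpoint: both units weigh equally
print(grnn_predict([0.0, 0.0], train_x, train_y))  # dominated by the first stored sample
```

    No iterative weight training is involved: the network is fully determined once the training samples are stored and a spread is chosen, which is why the hidden layer has as many neurons as there are training samples.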

    In the conventional modeling process, regression analysis is an important tool for building a model. In this study, multivariable logistic regression (Logit) and least squares regression (LSR) analyses were also performed to gauge the predictive power of the GEP and MEP techniques in comparison with classical statistical approaches. Logit fits a linear logistic regression model for binary or ordinal response data by the method of maximum likelihood. The LSR method is used extensively in regression analysis. It minimizes the sum of squared residuals for each equation, accounting for any cross-equation restrictions on the parameters of the system. The Logit and LSR models were developed using the combined learning and validation (training) data previously used for developing the GEP and MEP models. The same testing data as for GEP and MEP were used to test the performance of the regression models. The Logit and LSR formulations of BC in terms of R1, R2, R3, and R4 are given below:

    $$BC_{\mathrm{Logit}} = \frac{1}{1 + \exp\left(-\left(0.3031 - 1.4178R_1 - 5.7360R_2 - 4.4893R_3 + 6.7622R_4\right)\right)} \tag{17}$$

    $$BC_{\mathrm{LSR}} = -0.2508R_1 - 0.4778R_2 - 0.5093R_3 + 0.7423R_4 + 0.6089 \tag{18}$$

    The classification results obtained by the GRNN, Logit and LSR models for the training and testing sets are presented in Table VIII. As can be observed from Tables III, IV, VI, VII, and VIII, the GEP- and MEP-based models generally outperform the GRNN, Logit and LSR models. Notably, the testing results of the GRNN model are identical to those of GEP Model II on all four measures.
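    As an illustration of how an explicit formulation such as Eq. (17) is applied, the sketch below evaluates a logistic score for one firm and thresholds it. The coefficient signs are an assumption of this sketch (negative for R1, R2 and R3, positive for R4 and the intercept, with the usual negative sign inside the exponential), as are the 0.5 cut-off, the convention that a score near 1 denotes a bankrupt firm, and the ratio values of the example firm.

```python
import math

# Intercept and coefficients for R1..R4, with the signs assumed in this sketch.
LOGIT_COEF = (0.3031, -1.4178, -5.7360, -4.4893, 6.7622)


def bc_logit(r1, r2, r3, r4, coef=LOGIT_COEF):
    """Logistic score in (0, 1) for one firm described by four financial ratios."""
    z = coef[0] + coef[1] * r1 + coef[2] * r2 + coef[3] * r3 + coef[4] * r4
    return 1.0 / (1.0 + math.exp(-z))


def classify(score, threshold=0.5):
    """Map a score to a class label; the 0.5 threshold is an assumption."""
    return "bankrupt" if score >= threshold else "non-bankrupt"


# Hypothetical firm; the ratio values are made up for illustration.
score = bc_logit(r1=0.10, r2=0.05, r3=0.30, r4=0.90)
print(round(score, 4), classify(score))
```

    Under these assumptions the score rises with R4 and falls with R1, R2 and R3, so a reader can check the direction of each ratio's effect directly from the formula.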

    The detection performance of the different classifiers on the entire database is visualized by the ROC curves (Figure 9). Based on the ROC analysis, Model I evolved by GEP, with an area index (Az) of 0.960, achieves a statistically better performance than the other models. Models III and IV of MEP have similar classification performance, given their comparable Az values. As can be seen in the figure, the proposed GEP and MEP models perform better than the GRNN, Logit and LSR models.

    Most of the previously reported methods in the literature, such as neural networks and support vector machines, have some fundamental disadvantages. Such methods do not provide an explicit function for calculating the outcome from the input values. Hence they offer little insight into the nature of the derived relationship between the input and output data, and they are mainly appropriate for use as part of a computer program. Conversely, GEP and MEP provide explicit equations that can readily be used in practical applications. Empirical modeling based on traditional statistical regression techniques relies on assuming the structure of the model in advance,

    Table VIII. The classification results obtained by the GRNN, Logit and LSR models

                                 GRNN                   Logit                  LSR
    Item                         Training   Testing     Training   Testing     Training   Testing
                                 data       data        data       data        data       data
    Sensitivity (%)              80.00      80.00       77.78      80.00       71.11      70.00
    Specificity (%)              85.96      78.57       85.96      71.43       80.70      71.43
    Positive predictivity (%)    81.82      84.21       81.40      80.00       74.42      77.78
    Accuracy (%)                 83.33      79.41       82.35      76.47       76.47      70.59


    Figure 9. ROC performance evaluation of different classifiers for the bankruptcy prediction (panels (a), (b) and (c); sensitivity versus 1 − specificity). Area indices: GEP, Model I (Az = 0.96); GEP, Model II (Az = 0.92); MEP, Model III (Az = 0.948); MEP, Model IV (Az = 0.945); GRNN (Az = 0.912); Logit (Az = 0.893); LSR (Az = 0.89)

    which may be suboptimal. GEP and MEP, on the other hand, can capture the knowledge contained in the data directly, without assuming a prior form for the underlying relationship. It should be noted that the proposed GEP- and MEP-based formulations are valid only within the ranges of the training data used for their development (Alavi et al., 2011).
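    The area index used in the ROC comparison can be estimated in several ways. The sketch below computes the empirical area under the ROC curve through its rank (Mann-Whitney) interpretation: the probability that a randomly chosen bankrupt firm receives a higher score than a randomly chosen non-bankrupt firm. The procedure used in this study to obtain Az is not described here and may differ (for example, a fitted curve), so this is an illustration only; the scores and labels are made up.

```python
def roc_auc(scores, labels):
    """Empirical area under the ROC curve; labels are 1 (positive) or 0 (negative)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs for three bankrupt (1) and two non-bankrupt (0) firms.
scores = [0.9, 0.8, 0.35, 0.6, 0.2]
labels = [1, 1, 1, 0, 0]
print(roc_auc(scores, labels))
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.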

    One of the goals of introducing expert systems, such as GP-based approaches, into the decision process is better handling of information in the preliminary phases. In the initial steps of forming decisions, information about the features and properties of the targeted output or process is often imprecise and incomplete (Kraslawski et al., 1999; Alavi et al., 2011). Nevertheless, it is desirable to have some initial estimates of the outcome. The GEP and MEP approaches rely on the data alone to determine the structure and parameters of the models (Alavi et al., 2011). They should therefore be treated as diagnostic aids for producing plausible nonlinear models and used cautiously for final decision making. In any case, the role of financial experts in interpreting the results should not be underestimated.

    CONCLUSIONS

    In the present study, new variants of GP, namely GEP and MEP, were employed to classify samples of

    bankrupt and non-bankrupt Iranian firms. Four effective predictive financial ratios were used as input


    variables. These ratios were identified through an extensive review of the bankruptcy prediction literature and a feature selection analysis. GRNN-, Logit- and LSR-based models were also developed to benchmark the GEP and MEP models. The major findings of this research are as follows:

    (i) The proposed GEP and MEP models give reliable estimates of the bankruptcy classification and provide superior performance to the GRNN, Logit and LSR models.

    (ii) Unlike classical statistical methods, GEP and MEP are capable of modeling business failure without any need for predefined equations.

    (iii) According to the frequency values, bankruptcy prediction is more sensitive to the quick assets to total assets ratio and the total liability to total assets ratio than to the other variables.

    (iv) The proposed GEP and MEP models give the user insight into the relationship between the input and output data. An interesting feature of these approaches is the possibility of obtaining more than one prediction model by varying the parameters and function sets involved in their algorithms.

    (v) Another feature of the GEP and MEP methods is the high level of interactivity between the user and the methodology. User insight can be used to make propositions on the elements and structures of the evolved functions.

    The present work showed that the GEP and MEP approaches can be regarded as promising tools for future applications to bankruptcy prediction problems. Further research can focus on both the problem domain and the computing domain. As more data become available, including data for other corporations, the models can be improved to make more accurate predictions over a wider range. GEP and MEP are robust in modeling nonlinear relationships. However, the underlying assumption that the inputs are reliable does not always hold. Fuzzy logic can provide a systematic method for dealing with imprecise and incomplete information. Thus the development of hybrid fuzzy-GEP and fuzzy-MEP models for the investigated problem could be a suitable topic for further studies (Gandomi et al., 2011).

    ACKNOWLEDGEMENTS

    The authors are thankful to Amir Hossein Alavi (Iran University of Science and Technology, Tehran,

    Iran) and Amir Hossein Gandomi (Tafresh University, Tafresh, Iran) for their support and stimulating

    discussions.

    REFERENCES

    Ahn H, Kim K. 2009. Bankruptcy prediction modeling with hybrid case-based reasoning and genetic algorithms approach. Applied Soft Computing 9: 599–607.
    Aho A, Sethi R, Ullman J. 1986. Compilers: Principles, Techniques, and Tools. Addison-Wesley: Reading, MA.
    Alavi AH, Gandomi AH, Sahab MG, Gandomi M. 2010. Multi expression programming: a new approach to formulation of soil classification. Engineering with Computers 26(2): 111–118.
    Alavi AH, Ameri M, Gandomi AH, Mirzahosseini MR. 2011. Formulation of flow number of asphalt mixes using a hybrid computational method. Construction and Building Materials 25(3): 1338–1355.
    Altman EI. 1968. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. Journal of Finance 23: 589–609.
    Altman EI. 1993. Corporate Financial Distress and Bankruptcy: A Complete Guide to Predicting and Avoiding Distress and Profiting from Bankruptcy (2nd edn). Wiley: New York.
    Altman E, Avery R, Eisenbeis R, Sinkey J. 1981. Application of classification techniques in business, banking and finance. Contemporary Studies in Economic and Financial Analysis, Vol. 3. JAI Press: Greenwich, CT.


    Banzhaf W, Nordin P, Keller R, Francone F. 1998. Genetic Programming: An Introduction. On the Automatic Evolution of Computer Programs and its Application. dpunkt/Morgan Kaufmann: Heidelberg/San Francisco.
    Beaver W. 1966. Financial ratios as predictors of failures: empirical research in accounting, selected studies. Journal of Accounting Research (Suppl) 5: 71–127.
    Bentz Y, Merunka D. 2000. Neural networks and the multinomial logit for brand choice modelling: a hybrid approach. Journal of Forecasting 19: 177–200.
    Charitou A, Neophytou E, Charalambous C. 2004. Predicting corporate failure: empirical evidence for the UK. European Accounting Review 13: 465–497.
    Chauhan N, Ravi V, Chandra K. 2009. Differential evolution trained wavelet neural networks: application to bankruptcy prediction in banks. Expert Systems with Applications 36: 7659–7665.
    Dimitras AI, Zanakis SH, Zopounidis C. 1996. A survey of business failure with an emphasis on prediction methods and industrial application. European Journal of Operational Research 90: 487–513.
    Dimitras AI, Slowinski R, Susmaga R, Zopounidis C. 1999. Business failure prediction using rough sets. European Journal of Operational Research 114: 263–280.
    Divsalar M, Khatami Firouzabadi A, Sadeghi M, Behrooz AH, Alavi AH. 2011. Towards the prediction of business failure via computational intelligence techniques. Expert Systems (in press). doi: 10.1111/j.1468-0394.2011.00580.x
    Etemadi H, Rostamy AA, Dehkordi HF. 2008. A genetic programming model for bankruptcy prediction: empirical evidence from Iran. Expert Systems with Applications 36: 3199–3207.
    Ferreira C. 2001. Gene expression programming: a new adaptive algorithm for solving problems. Complex Systems 13: 87–129.
    Ferreira C. 2006. Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence (2nd edn). Springer: Berlin.
    Fitzpatrick PJ. 1931. Symptoms of Industrial Failures. Catholic University of America Press: Washington, DC.
    Foreman RD. 2003. A logistic analysis of bankruptcy within the US local telecommunications industry. Journal of Economics and Business 55(2): 135–166.
    Fukuda S, Kasuya M, Akashi K. 2009. Impaired bank health and default risk. Pacific-Basin Finance Journal 17: 145–162.
    Gandomi AH, Alavi AH, Mirzahosseini R, Moqaddas Nezhad F. 2011. Nonlinear genetic-based models for prediction of flow number of asphalt mixtures. Journal of Materials in Civil Engineering, ASCE 23(3): 1–17.
    GEPSOFT. 2006. GeneXproTools Owner's Manual. Version 4.0. Available: http://www.gepsoft.com/ [9 February 2011].
    Jones FL. 1987. Current techniques in bankruptcy prediction. Journal of Accounting Literature 6: 131–164.
    Kahya E, Theodossiou P. 1999. Predicting corporate financial distress: a time-series CUSUM methodology. Review of Quantitative Finance and Accounting 13: 323–345.
    Koza JR. 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press: Cambridge, MA.
    Kraslawski A, Pedrycz W, Nyström L. 1999. Fuzzy neural network as instance generator for case-based reasoning system: an example of selection of heat exchange equipment in mixing. Neural Computing and Applications 8: 106–113.
    Kumar PR, Ravi V. 2007. Bankruptcy prediction in banks and firms via statistical and intelligent techniques: a review. European Journal of Operational Research 180: 1–28.
    Lensberg T, Eilifsen A, McKee TE. 2006. Bankruptcy theory development and classification via genetic programming. European Journal of Operational Research 169: 677–697.
    Li H, Sun J. 2010. Business failure prediction using hybrid2 case-based reasoning (H2CBR). Computers and Operations Research 37: 137–151.
    Lin TH. 2009. A cross model study of corporate financial distress prediction in Taiwan: multiple discriminant analysis, logit, probit and neural networks models. Neurocomputing (in press).
    MathWorks. 2007. MATLAB: the language of technical computing, Version 7.4. Natick, MA.
    McKee TE, Lensberg T. 2002. Genetic programming and rough sets: a hybrid approach to bankruptcy classification. European Journal of Operational Research 138: 436–451.
    Min JH, Lee YC. 2005. Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Systems with Applications 28: 603–614.
    Ohlson J. 1980. Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research 18: 109–131.


    Oltean M. 2004. Multi Expression Programming source code. Available: http://www.mep.cs.ubbcluj.ro/ [9 February 2011].
    Oltean M, Dumitrescu D. 2002. Multi expression programming. Technical report, UBB-01-2002, Babes-Bolyai University, Cluj-Napoca, Romania.
    Oltean M, Grossan C. 2003a. A comparison of several linear genetic programming techniques. Advances in Complex Systems 14: 1–29.
    Oltean M, Grossan C. 2003b. Evolving evolutionary algorithms using multi expression programming. In 7th European Conference on Artificial Life, Banzhaf W, Christaller T, Dittrich P, Kim JT, Ziegler J (eds). Dortmund, 14–17 September. LNAI 2801. Springer: Berlin; 651–658.
    Park C, Han I. 2002. A case-based reasoning with the feature weights derived by analytic hierarchy process for bankruptcy prediction. Expert Systems with Applications 23: 225–264.
    Psillaki M, Tsolas IE, Margaritis D. 2010. Evaluation of credit risk based on firm performance. European Journal of Operational Research 201(3): 873–881.
    Ravi V, Pramodh C. 2008. Threshold accepting trained principal component neural network and feature subset selection: application to bankruptcy prediction in banks. Applied Soft Computing 8: 1539–1548.
    Ryan TP. 1997. Modern Regression Methods. Wiley: New York.
    Sanchis A, Segovia MJ, Gil JA, Heras A, Vilar JL. 2007. Rough sets and the role of the monetary policy in financial stability (macroeconomic problem) and the prediction of insolvency in insurance sector (microeconomic problem). European Journal of Operational Research 181: 1554–1573.
    Shin K, Lee Y. 2002. A genetic algorithm application in bankruptcy prediction modeling. Expert Systems with Applications 23: 321–328.
    Specht DF. 1991. A generalized regression neural network. IEEE Transactions on Neural Networks 2: 568–576.
    Stone M, Rasp J. 1991. Tradeoffs in the choice between Logit and OLS for accounting choice studies. Accounting Review 1: 170–178.
    Theodossiou PT. 1991. Alternative models for assessing the financial condition of business in Greece. Journal of Business Finance and Accounting 18: 697–720.
    Vranas AS. 1992. The significance of financial characteristics in predicting business failure: an analysis in the Greek context. Foundations of Computing and Decision Sciences 17: 257–275.
    Wu C, Tzeng G, Goo Y, Fang W. 2007. A real-valued genetic algorithm to optimize the parameters of support vector machine for predicting bankruptcy. Expert Systems with Applications 32: 397–408.
    Zhou C, Xiao W, Tirpak TM, Nelson PC. 2002. Discovery of classification rules by using gene expression programming. In Artificial intelligence (IC-AI'02), Las Vegas, NV, pp. 1355–1361.

    Authors' biographies:

    Mehdi Divsalar has received his BSc in Electrical Engineering from Iran University of Science and Technology, Tehran, Iran. He also received his MSc degree in Industrial Management from Faculty of Management and Accounting, Allameh Tabatabai University, Tehran, Iran. His research interests include Artificial Intelligence, Forecasting, System Dynamics, Data mining, and Application of Operations Research Methodologies.

    Dr. Habib Roodsaz is an Assistant Professor in Allameh Tabatabai University Business School (ATUBS), Faculty of Management, Tehran, Iran. He received BSc and MSc degrees in Business Management from The University of Tehran. He received his PhD in Management Information Systems (MIS) from The University of Tehran. Dr. Roodsaz's research areas include Strategic Information Systems, Electronic Business, Electronic Commerce, and Strategic Information Systems Planning.

    Farshad Vahdatinia received his BSc degree in Civil Engineering from Department of Civil Engineering, Ferdowsi University of Mashhad, Mashhad, Iran. His research interests include Construction Engineering and Management, Applications of Artificial Intelligence and Heuristic Optimization Techniques in Construction Management, Time-Cost Trade-off Analysis, Resource optimization in Projects, and Project Economics and Risk Analysis.

    Ghassem Norouzzadeh has received his BSc in Computer Engineering from Sadjad Institute of Higher Education, Mashhad, Iran. He also received his MSc degree in Executive Management from Faculty of Management, University of Tehran, Tehran, Iran. His research interests include Artificial Intelligence, Forecasting, System Dynamics, Data mining, Advertisement, Marketing Research, and Application of Operations Research Methodologies.


    Amir Hossein Behrooz received a BSc degree in Industrial Engineering from Iran University of Science and Technology. He also received his MSc degree in Executive Management from Faculty of Management and Accounting, Allameh Tabatabai University, Tehran, Iran. He is currently a lecturer at Payame Noor University (PNU). His research interests include Management Information System, Strategic Management, Problem Solving, and Financial Management.

    Authors' addresses:

    Mehdi Divsalar, Habib Roodsaz and Amir Hossein Behrooz, Faculty of Management and Accounting, Allameh Tabatabai University, Tehran, Iran.

    Farshad Vahdatinia, Department of Civil Engineering, Ferdowsi University of Mashhad, Mashhad, Iran.

    Ghassem Norouzzadeh, Faculty of Management, University of Tehran, Tehran, Iran.
