

IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, VOL. 46, NO. 3, AUGUST 1999 335

Reengineering Claims Processing Using Probabilistic Inductive Learning

Ruthra G. Arunasalam, Jill T. Richie, William Egan, Özden Gür-Ali, and William A. Wallace, Senior Member, IEEE

Abstract: With health care costs in the United States skyrocketing, and $0.25 of every health care dollar being spent on systems and claims administration, technological advances such as electronic claims filing are being advocated as cost-reducing measures. These improvements alone, however, will not significantly reduce costs unless they are accompanied by revisions in the entire claims processing system. This study explores the reliability and utility of probabilistic inductive learning (PrIL), a statistically enhanced decision tree algorithm, for improving the decision-making process at the New York State Workers’ Compensation Board (WCB). The WCB sees a high volume of claims every year, and its administrative costs are considerably higher than those of other states of similar size. In response, legislation has been enacted that routes cases with a short expected duration of benefits to nontraditional processing. It is expected that such regulation will shorten the claim life cycle and reduce a backlog of cases at a reduced cost. Using several different models, PrIL was used to generate rules that could be used to assign cases to different processing routes within the WCB system. For purposes of comparison, models using logistic regression analysis and conventional decision tree methodology were also produced. Results indicated that the PrIL algorithm compares favorably with both the purely statistical and the classical decision tree methodologies, with the added advantages of easy-to-understand rules and user-defined reliability measures for each of those rules. Given the appropriate information regarding the relative value of correct and incorrect classification of cases in the WCB system, PrIL can be used to assist the decision-making process accurately in terms of reducing cost and predicting and enhancing quality and case outcomes in managed care practices.

Index Terms: Administrative costs, claims processing, conciliation process, data mining, decision making, decision trees, health care administration, logistic regression, New York State Workers’ Compensation Board, probabilistic inductive learning (PrIL).

Manuscript received March 19, 1997; revised April 1998. Review of this manuscript was arranged by Guest Editor A. Reisman.

R. G. Arunasalam is with the Bureau of Medical Management, New York State Workers’ Compensation Board, Albany, NY 12207 USA.

J. T. Richie is with the Knowledge Discovery and Data Mining Group, Decision Sciences and Engineering Systems, Rensselaer Polytechnic Institute, Troy, NY 12180-3590 USA.

W. Egan is with the Bureau of Medical Management, New York State Workers’ Compensation Board, Albany, NY 12207 USA.

Ö. Gür-Ali is with ZS Associates, Evanston, IL 60201 USA.

W. A. Wallace is with the Knowledge Discovery and Data Mining Group, Decision Sciences and Engineering Systems, Rensselaer Polytechnic Institute, Troy, NY 12180-3590 USA.

Publisher Item Identifier S 0018-9391(99)05831-6.

0018–9391/99$10.00 © 1999 IEEE

I. INTRODUCTION

There is no doubt that Americans, especially the aging baby boomer population, are concerned about health care costs. Electronic billing, insurance industry consolidation, health maintenance organizations (HMOs), and “managed competition” among private insurers are all measures which have been proposed and advocated to reduce health care expenditures [1]. What these measures often do not address, however, is the relative impact of various aspects of the health care system on health care costs.

In most public and private health insurance plans, a formalized written claim (filed by the insured individual, treating physician, or hospital) is required before any disbursal of funds can occur. This claims system has become increasingly complicated and elaborate over the years. For many users of the system, particularly the elderly, it may be overwhelming [2]. Health care providers must document care in minute detail, often according to rules which may be confusing and contradictory [3]. Much of this documentation must ultimately be submitted to one (or more) of the approximately 1500 regional or national insurance companies on one (or more) claims forms, which are then reviewed according to internal guidelines before any payments are made. A single mistake on one form can result in delay or refusal of payment. Between 1988 and 1990, over 74 000 new clerical personnel were hired by U.S. physicians to handle patient billing and other administrative tasks; the Mayo Clinic alone has 70 staff to handle managed care issues [1]. It is therefore not surprising to learn that, in the United States, $0.25 of every health care dollar is spent on administration, with individual states ranging anywhere from 18 to 31% [1], [2]. Further compounding the medical cost issue are the associated indemnity costs accruing from the 6.2 million nonfatal occupational injuries and illnesses in private industry [25].

As mentioned above, technology in the form of electronic record keeping and claims filing has been added to many health care settings. In many cases, however, electronic record keeping and claims submission efforts are simply replacing ink and paper with computer bytes, with no significant redesign of the claims process [4], [5]. As a result, these innovations will do little to reduce the administrative cost of care unless additional steps are taken to adjust the way claims (electronic or paper) are processed by the insurance providers. We must therefore turn our attention to technology that can help us examine and reengineer the claims process.

II. THE CASE OF THE NEW YORK STATE WORKERS’ COMPENSATION BOARD (WCB)

A. Background

In the first decade of the twentieth century, work-related injury in the United States had reached an all-time high. Despite this fact, injured workers were rarely compensated, as they first had to prove negligence on the part of their employers. Such proceedings were time consuming and legal fees were high, so the burden of caring for injured workers fell primarily on charitable organizations. This situation led to the proposal of workers’ compensation statutes which were designed to provide predetermined medical and monetary benefits and avoid the delay and inequity associated with extensive litigation. In a particularly radical move, it was also proposed that the costs of work-related injuries be covered by employers, not because of any assumed blame on the part of the employer, but because of the hazardous nature of industrial employment itself. This no-fault approach proved to be popular; workers’ compensation laws based on this principle were passed in all but six states between 1911 and 1920 [6].

New York State was quick to recognize the need for improved workers’ compensation legislation. In 1910, the first workers’ compensation law based on the no-fault principle was passed. After a number of legal battles and questions about the law’s constitutionality, the legislation was revised and passed again in 1914. These early laws covered accidental injury only. Compensation for certain types of occupational disease was not added until 1920; coverage for all occupational diseases was enacted in 1935 [6]. The task of workers’ compensation in general has grown more difficult over the years. In the past, the railroad and coal mining industries produced the highest rates of worker injury. These days, advances in technology have placed individuals in situations unimagined by the lawmakers in the earlier parts of this century. Prolonged exposure to chemical or environmental agents previously unrecognized as hazardous has increased the number of diseases considered work related. In addition, as medicine becomes more adept at treating injury and disease, it also adds to our perception of what may be claimed as a work-related injury [6]. It is estimated that by the year 2000, workers’ compensation will cost employers more than $140 billion annually, with medical costs comprising the fastest growing component. In 1980, medical costs were about one-third of the total workers’ compensation bill; today they are almost 40%. It should also be noted that when we speak of employers in today’s environment, we are talking about private insurers contracted by employers to provide financial benefits and employers who are self-insuring. Approximately 50–60% of workers’ compensation benefits are paid by these private companies. This fact makes workers’ compensation different from other social insurance programs such as unemployment compensation, social security, or welfare [7]. Any changes made in workers’ compensation will necessarily have an impact on both the public and the private sector.

Just as other health care systems are doing, many states are attempting to reduce the medical costs of workers’ compensation by moving to managed care systems, combining workers’ compensation coverage with standard medical insurance (commonly referred to as twenty-four hour coverage), or requiring precertification for medical treatment [8]. The Washington State WCB has found that a small fraction of disabilities with a long duration of benefits accounts for a disproportionate amount of workers’ compensation benefits. Therefore, they have taken steps to stop these cases from entering the system at all by identifying a number of factors which predict lengthy duration of benefits in order to target subgroups of employees for greater injury prevention efforts [9].

Currently, the New York State WCB is responsible for ensuring that individuals injured on the job receive the medical treatment and benefits prescribed by law. These benefits include not only medical coverage but monetary compensation for temporary loss of earnings and long-term wage replacement for partial or full disability, facial disfigurement, or death. The Board is also responsible for keeping records of occupational injuries and diseases, for enforcing insurance requirements, and for resolving claims disputes. The New York State WCB handles a tremendous number of claims, more than other large states such as Georgia and Michigan (which are considered to be of similar size in terms of workers’ compensation requirements). In 1991 alone, there were 242 434 cases opened or reopened, 529 940 hearings or conferences, and 17 587 appeals. These actions resulted in the disbursal of almost $1.7 billion in workers’ compensation benefits to injured workers. The cost of claims processing and agency administration in New York State is also unusually large compared to other states. In 1988, for states with over two million workers, New York’s cost per worker was more than 90% above the national average. In 1991, total WCB administrative expenditures exceeded $67 million, or $8.60 per employed person.

One explanation for this vast difference in New York’s administrative cost is the extended involvement of judges and lawyers in routine claims processing, as well as considerable use of live medical testimony [10]. This is somewhat ironic, given that part of the motivation for the initial development of workers’ compensation legislation was to avoid the protracted and expensive legal intervention that characterized the system in the nineteenth and early twentieth centuries. In fact, the amount of direct, formal intervention seen in New York is unusual today. Many other states have equally successful programs with much less legal scrutiny. Another explanation for the vast difference in administrative cost is the volume of paperwork processed. Compared to other states, the New York State WCB is considered by employers and insurers to be quite burdensome in terms of the number of claims processing forms required for adjudication of a case [10]. As in most paper-driven systems, New York State has a substantial backlog; a current estimate suggests that this backlog is approximately 300 000 cases [11].

B. Problem and Proposal

There is clearly a need to improve the timeliness of claims processing within the New York State WCB system. In addition, in the face of growing benefits costs, it would also seem beneficial to introduce measures which would contain or even decrease New York’s unusually high administrative costs. One option is to differentially process cases where the injured party is expected to receive limited benefits [11]. As we have seen in the Washington State investigation, it is possible to identify characteristics of a case such as age, sex, or medical diagnosis which will predict duration of benefits. In New York State, legislation has been enacted by which WCB claims with expected benefits duration of eight weeks or less will be handled by a “Conciliation Bureau” rather than the more formal WCB judicial hearings process, with provisions that a case may be transferred back to the “normal” route if necessary (e.g., when extended benefits are necessary, or the claimant wishes to appeal the Bureau’s decision) [12]. The primary advantage of the Conciliation Bureau is the elimination of the need for formal judicial review to process and close routine claims. The Bureau is staffed by board attorneys to resolve cases administratively at a meeting, scheduled within 30 days of notification. However, in cases where claimants are unrepresented by an attorney, agreements between parties are usually reviewed and signed by a judge [10].

Currently, the WCB has no differential tracking policy. However, if differential processing or any similar system is to be effectively implemented, the WCB needs a quick and reliable method to identify cases which are candidates for “conciliation” as they enter the system. Although the New York State WCB lacks systematic data on key system outcomes such as speed of resolving cases, it has made great strides over the past few years in monitoring and recording the progress of cases through the system [10]. Large databases containing information about closed cases are being constructed. With hundreds of thousands of cases each year, the WCB databases are an appropriate candidate for the application of the large-scale data analysis tools and procedures collectively referred to as data mining. In general, data mining involves fitting models or uncovering patterns or trends in large data sets for the purposes of extracting new knowledge and understanding [13] or verifying a hypothesis [14]. Commonly used techniques include (but are not limited to) decision trees, regression analysis, clustering, neural networks, fuzzy logic, and genetic algorithms.

Each of these techniques has advantages and disadvantages which appeal to different users. No matter what the source, it is important that predictors used for a classification task be consistent and reliable. In addition, the individuals who will eventually use the model to make decisions must be able to understand what the various criteria mean. Decision trees are a popular tool in data mining because their output is an easy-to-understand rule base [15]. Unfortunately, measurements of reliability for traditional decision trees are limited to a single global measure, which may not be useful to the modern decision maker [16]. Statistical methods, in contrast, provide hypothesis testing and calculable measures of uncertainty and are often more accurate. However, some of the assumptions underlying statistical tests may be difficult to meet, and the results of such analyses may be difficult for laymen to understand [17]. We propose to use a data mining algorithm known as probabilistic inductive learning (PrIL) [16], [18] to develop a classification system which can be used to identify cases appropriate for Conciliation Bureau routing. PrIL has been successfully applied in classifying delinquent customers for credit collections of a major bank. The induced rules were able to generate approximately 30% in savings when compared with the bank’s standard operating procedures [19]. In addition, PrIL was evaluated on standard data sets against other classification algorithms and found to be at least as accurate as other tree induction algorithms [16], [20].

The purpose of this paper is to demonstrate the potential contribution of PrIL to reengineering claims processing in the WCB. The following section discusses the PrIL algorithm in detail. Section IV highlights the application of PrIL in the analysis of claims data from the WCB. The results from this analysis are presented in Section V. We conclude by discussing the cost/benefit implications for the WCB claims process.

III. PROBABILISTIC INDUCTIVE LEARNING (PrIL)

PrIL is a decision tree methodology which combines traditional machine learning with statistical techniques to generate rules and quality measures by which to judge those rules. When using any classification system, the user wants to be sure that the system places as many cases as possible in the correct class. This is a measure of the system’s reliability. PrIL trees are constructed such that each rule meets a minimum, user-specified reliability. In addition, this reliability (i.e., correct classification probability) can be stated with a prespecified level of confidence. This methodology specifies a range between 90 and 99%, depending on the minimum number of cases used to build a rule. This aspect follows closely the work of other decision tree methodologies, which provide the ability to reduce errors by providing alternative rule selection capabilities. The repeated incremental pruning to produce error reduction (RIPPER) algorithm [22] is able to reprune each rule such that the reliability of the complete rule set is maximized. This enables lower error rates derived from more effective and efficient decision rules. Reliability of individual rules is often important because, in many applications, misclassifications among the various classes do not bear the same cost. Each category into which cases may be classified has different risks and benefits associated with it. For example, a bank’s decision to deny a loan to a qualified candidate may incur less cost than giving a loan to a marginal candidate.
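One way to read "reliability stated with a prespecified level of confidence" is as a one-sided lower confidence bound on a rule's proportion of correctly classified cases. The sketch below uses a normal-approximation bound; the function names, the choice of bound, and the example counts are illustrative assumptions and are not taken from the PrIL procedure in [16], [18].

```python
from statistics import NormalDist


def lower_bound(correct: int, covered: int, confidence: float) -> float:
    """One-sided normal-approximation lower bound on a rule's reliability.

    Assumption: this bound stands in for whatever interval PrIL actually uses.
    """
    if covered <= 0:
        raise ValueError("a rule must cover at least one case")
    p = correct / covered                      # observed reliability
    z = NormalDist().inv_cdf(confidence)       # e.g. about 1.645 for 95%
    return p - z * (p * (1 - p) / covered) ** 0.5


def rule_is_acceptable(correct: int, covered: int,
                       min_reliability: float, confidence: float = 0.95) -> bool:
    """Accept a rule only if its lower bound clears the user-specified minimum."""
    return lower_bound(correct, covered, confidence) >= min_reliability
```

With the same observed reliability of 0.90, a rule built on 100 cases clears a 0.80 minimum while a rule built on 10 cases does not, which illustrates why the attainable confidence depends on the number of cases behind a rule.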

The methodology used for this paper incorporates the following five primary steps: 1) data collection; 2) exploratory data analysis; 3) branching; 4) subset elimination; and 5) tree evaluation.

A. Data Collection

This step involves not only the collection or selection of a data set, but the precise definition of the problem and specification of domain knowledge. PrIL requires a substantial number of training cases with known classes (or outcomes), no missing attributes, and all data in categorical form. For attributes which are continuous, there are often “cutoff points” used in real-world practice which can be used as a basis for categorization.
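As a minimal sketch of turning a continuous attribute into the categorical form PrIL requires, the helper below maps a value to a labeled interval using fixed cutoff points. The cutoffs and labels shown for age are hypothetical and are not the categories used in the study.

```python
from bisect import bisect_right


def categorize(value: float, cutoffs: list[float], labels: list[str]) -> str:
    """Map a continuous value to a category using ascending cutoff points.

    A value equal to a cutoff falls into the higher category.
    """
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly one more label than cutoffs")
    return labels[bisect_right(cutoffs, value)]


# Hypothetical cutoffs for claimant age (illustration only).
AGE_CUTOFFS = [30, 45]
AGE_LABELS = ["18-29", "30-44", "45-65"]
```

For example, `categorize(29, AGE_CUTOFFS, AGE_LABELS)` returns `"18-29"` and `categorize(30, AGE_CUTOFFS, AGE_LABELS)` returns `"30-44"`.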

In some instances, there may be no past cases available for training. It might be that the outcomes process takes a very long time to evolve appropriate categories, or past decisions may have created a bias against certain types of cases. In these situations, we must rely on the domain knowledge of experts to develop our classification rules, keeping in mind that the results will of course be somewhat biased by the opinions of these experts. Even if an appropriate training set is available, human experts can and often should be used to identify attributes which could prove to be important in the branching phase of PrIL.

B. Exploratory Data Analysis

To supplement the knowledge of domain experts in cases where the categorization of attributes is unclear, univariate analysis of attributes using clustering or quantiles can be performed. Another approach is to divide the data according to attribute values such that the association between the values and the classification categories is maximized. However, these univariate techniques do not take into account any interaction effects which may be present.

The main effects and interactions of the categorized attributes are analyzed using logit modeling. Briefly, logit models are a statistical regression technique used when the dependent variable is categorical and the independent variables are either continuous or categorical. A logit model will return the estimated probability that a given case belongs to one of the two dependent groups. It is also possible to determine from the model which independent variables are useful or significant predictors [21]. In this paper, our dependent variable is whether a case should proceed to the Conciliation Bureau or follow the regular process, while the attributes defining the case serve as independent variables in the model.
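To make the output of a logit model concrete, the sketch below computes the estimated probability of one of the two groups from an intercept and a set of coefficients on indicator variables. The coefficient values and indicator names are hypothetical; they are not estimates from the WCB data.

```python
import math


def logit_probability(intercept: float,
                      coefficients: dict[str, float],
                      indicators: dict[str, float]) -> float:
    """Estimated probability from a fitted logit model.

    p = 1 / (1 + exp(-(intercept + sum of coefficient * indicator))).
    """
    z = intercept + sum(coefficients[name] * x for name, x in indicators.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for two categorized attributes (illustration only).
example_coefficients = {"AGE_45_65": -0.8, "SEX_F": 0.3}
example_case = {"AGE_45_65": 1, "SEX_F": 0}
p_conciliation = logit_probability(0.0, example_coefficients, example_case)
```

A coefficient whose estimate is not distinguishable from zero contributes nothing to `z`, which is the sense in which the fitted model indicates which attributes are useful predictors.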

C. Branching

Both domain knowledge and exploratory data analysis contribute to the choice of branching attributes. When this phase is complete, each branch in the resulting tree will contain a set of cases with the same values for each attribute used for branching. PrIL branches just enough to account for important main effects, while leaving enough cases in each branch for probabilistic statements to be made during the subset elimination phase.
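The branching step can be pictured as grouping cases by their values on the chosen branching attributes. This is a minimal sketch; representing each case as a dictionary and the attribute names in the usage note are assumptions made for illustration.

```python
from collections import defaultdict


def branch(cases: list[dict], branching_attributes: list[str]) -> dict[tuple, list[dict]]:
    """Group cases so that every branch shares the same branching-attribute values."""
    branches = defaultdict(list)
    for case in cases:
        key = tuple(case[attr] for attr in branching_attributes)
        branches[key].append(case)
    return dict(branches)
```

Branching on `["SEX"]` alone yields at most one branch per gender; adding further attributes multiplies the branches and thins out the cases in each, which is the trade-off the text describes.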

D. Subset Elimination

Once branching is complete, the subset elimination phase generates classification rules from the training data. However, before rules can be generated, minimum reliability requirements must be selected. The rules generated by PrIL will ultimately be used to guide or make a decision; therefore, the relative costs and benefits of incorrect and correct decisions must be determined. As mentioned above, there can be different costs associated with each class, which leads to different minimum acceptable reliabilities being chosen for different categories. Misclassification costs and correct classification benefits can be used to calculate reliability in the following way: if $B_i$ is the benefit derived from correctly classifying a case into category $i$ and $C_i$ is the cost (or loss) of misclassifying a case into category $i$, then for the classification process to be beneficial, the expected benefits should be at least equal to the expected costs (or losses). We can also express this as a formula. If we let $R_i$ be the reliability level for correctly classifying a case into category $i$, then

$$R_i B_i \geq (1 - R_i)\, C_i$$

or

$$R_i \geq \frac{C_i}{B_i + C_i}.$$

If we want the expected benefits to be greater than the expected losses by a given amount $\delta$, we can amend the formula such that

$$R_i B_i - (1 - R_i)\, C_i \geq \delta$$

or

$$R_i (B_i + C_i) \geq C_i + \delta$$

or

$$R_i \geq \frac{C_i + \delta}{B_i + C_i}.$$
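As a worked illustration of the break-even reasoning, read the condition as: reliability times benefit must be at least (1 − reliability) times cost, which gives a minimum reliability of cost / (benefit + cost). The sketch below computes that value; treating the "given amount" as an additive margin is an assumption, and the numbers in the usage note are hypothetical.

```python
def minimum_reliability(benefit: float, cost: float, margin: float = 0.0) -> float:
    """Smallest reliability R with R*benefit - (1 - R)*cost >= margin.

    Assumption: the required surplus is additive. A result above 1.0 means
    the requested margin cannot be reached at any reliability.
    """
    if benefit + cost <= 0:
        raise ValueError("benefit + cost must be positive")
    return (cost + margin) / (benefit + cost)
```

If a misclassification into a category costs four times what a correct classification gains (benefit 1, cost 4), the break-even reliability for that category is 0.8; with equal cost and benefit it drops to 0.5.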

Once the reliabilities have been chosen, the following routine can be applied to each subset branch: 1) for each instance where there are cases remaining and unused attributes, build a rule set using an attribute or combination of attributes and determine its significance; 2) select the rule set which is most significant and post it; 3) declare the attribute(s) associated with the posted rule set as used; and 4) eliminate all cases covered by the posted rule set. This process is repeated until no cases remain or the attribute pool is exhausted. The significance of an attribute is calculated by combining the results for all values of that attribute for which minimum reliability is met.
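The four-step routine can be sketched as a loop over one branch. This is a simplified illustration, not the published algorithm: it considers single attributes only (no combinations), uses the observed proportion as the reliability, and substitutes "number of cases covered by qualifying rules" for PrIL's significance measure. The `CLASS` field name and `min_cases` parameter are likewise assumptions.

```python
from collections import Counter, defaultdict


def rules_for(cases, attr, min_rel, label="CLASS", min_cases=5):
    """Rules (attr, value, class, reliability, n) that meet the class's minimum reliability."""
    by_value = defaultdict(list)
    for case in cases:
        by_value[case[attr]].append(case[label])
    rules = []
    for value, labels in by_value.items():
        cls, hits = Counter(labels).most_common(1)[0]   # majority class for this value
        if len(labels) >= min_cases and hits / len(labels) >= min_rel[cls]:
            rules.append((attr, value, cls, hits / len(labels), len(labels)))
    return rules


def subset_elimination(cases, attributes, min_rel, label="CLASS", min_cases=5):
    """Post rule sets attribute by attribute; return (posted rules, undecided cases)."""
    remaining, unused, posted = list(cases), list(attributes), []
    while remaining and unused:
        # Step 1: build a candidate rule set for every unused attribute.
        candidates = {a: rules_for(remaining, a, min_rel, label, min_cases) for a in unused}
        # Step 2: pick the "most significant" set (stand-in: most cases covered).
        best = max(unused, key=lambda a: sum(rule[4] for rule in candidates[a]))
        if not candidates[best]:
            break                                        # nothing meets the minimum reliability
        posted.extend(candidates[best])
        unused.remove(best)                              # Step 3: mark the attribute as used
        covered = {rule[1] for rule in candidates[best]}
        remaining = [c for c in remaining if c[best] not in covered]   # Step 4
    return posted, remaining
```

Cases left in the second return value are those no posted rule covers; they correspond to the undecided cases discussed under tree evaluation.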

Up to this point we have been discussing the user-defined minimum reliability as if it were an absolute. However, it should be noted that PrIL can be made to induce rules at the highest attainable reliability. For example, while the user-selected reliability for a rule may be 0.80, PrIL can be directed to start with a reliability of 0.95 and reduce incrementally on successive generative runs. Specificity of the tree can also be changed by limiting the number of attributes contained in any one rule or setting a minimum number of cases covered as a requirement to post a rule.
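The idea of starting high and stepping down can be sketched as a schedule of reliability levels, with each candidate rule posted at the first level it satisfies. The step size of 0.05 and both function names are assumptions; the text gives only the 0.95 starting point and the 0.80 floor as an example.

```python
def reliability_schedule(start: float = 0.95, floor: float = 0.80,
                         step: float = 0.05) -> list[float]:
    """Reliability levels tried on successive runs, from `start` down to `floor`."""
    levels, k = [], 0
    while True:
        level = round(start - k * step, 10)   # rounding avoids float drift
        if level < floor - 1e-12:
            break
        levels.append(level)
        k += 1
    return levels


def highest_attainable(observed: float, start: float = 0.95,
                       floor: float = 0.80, step: float = 0.05):
    """First (highest) scheduled level the observed reliability meets, else None."""
    for level in reliability_schedule(start, floor, step):
        if observed >= level:
            return level
    return None
```

A rule with an observed reliability of 0.88 would be posted on the 0.85 run rather than waiting for the 0.80 floor, and a rule at 0.70 would never be posted.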

E. Tree Evaluation

There are a number of criteria by which trees can be judged. First, the rules produced by the training set can be evaluated on a test set, and the rates of correct and incorrect classification noted. PrIL may also leave a subset of training cases which are not classified by any of the rules as undecided. Two trees generated using the same training set, reliabilities, and confidence levels can be compared based on the number of undecided cases. In general, the number of undecided cases increases as reliability and confidence levels increase, with reliability exhibiting more of an influence. A high percentage of undecided cases may also result from improper selection of branching variables or a low number of cases in each leaf of the tree.
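The three rates mentioned above (correct, incorrect, undecided) can be tallied with a short helper. In this sketch a rule is a hypothetical `(attribute, value, predicted class)` triple, rules are tried in posting order, and a case matched by no rule counts as undecided.

```python
def evaluate(rules, test_cases, label="CLASS"):
    """Return the fractions of test cases classified correctly, incorrectly, or not at all."""
    if not test_cases:
        raise ValueError("test set is empty")
    counts = {"correct": 0, "incorrect": 0, "undecided": 0}
    for case in test_cases:
        # First rule whose attribute value matches the case decides its class.
        predicted = next(
            (cls for attr, value, cls in rules if case.get(attr) == value), None
        )
        if predicted is None:
            counts["undecided"] += 1
        elif predicted == case[label]:
            counts["correct"] += 1
        else:
            counts["incorrect"] += 1
    return {name: n / len(test_cases) for name, n in counts.items()}
```

Comparing the `undecided` fraction of two trees built with the same reliabilities and confidence levels gives the comparison the text describes.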

TABLE I. CONTINGENCY TABLES RESULTS: CHI-SQUARE STATISTIC

Reevaluation of an implemented tree can be a routine process of comparing the proportion of correct classification for each rule to a two-sigma limit. If the proportion of correctly classified cases falls below the lower limit as determined by

$$R - 2\sqrt{\frac{R(1 - R)}{n}}$$

where $R$ is the reliability of the rule and $n$ is the total number of cases in the subset, the reason for the deviation must be determined and a new tree should be induced using more recent data.
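The monitoring check can be sketched as follows, assuming the lower limit is the usual two-sigma bound for a proportion, R − 2·sqrt(R(1 − R)/n), with R the rule's reliability and n the number of cases the rule has handled. The function names are hypothetical.

```python
from math import sqrt


def lower_two_sigma_limit(reliability: float, n: int) -> float:
    """Lower two-sigma limit for the proportion of correct classifications."""
    if n <= 0:
        raise ValueError("n must be positive")
    return reliability - 2 * sqrt(reliability * (1 - reliability) / n)


def needs_reinduction(correct: int, n: int, reliability: float) -> bool:
    """True when the observed proportion correct falls below the lower limit."""
    return correct / n < lower_two_sigma_limit(reliability, n)
```

For a rule with reliability 0.80 applied to 100 new cases the limit is 0.72, so 75 correct classifications would pass and 70 would flag the rule for review.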

IV. APPLYING PrIL TO THE NEW YORK STATE WORKERS’ COMPENSATION DATA

A. Data Collection

The WCB database on compensated cases closed for 1989, the most recent complete data set available, was used for this study. Information in this data file summarized WCB cases having their initial closing with indemnity benefits during the 1989 calendar year. The file contained 133 214 case records which included information about case/claimant background, employment, injury/accident characteristics, extent of disability, indemnity benefits, and selected decision characteristics. An initial random sample of 10% was extracted from the data file. For the purposes of this study, several characteristics were determined to be important in order to ensure that the sample fully reflected the majority of the case record population. The following characteristics were used to “qualify” a particular case for the PrIL algorithm.

• Claimant age between 18 and 65 years. Very young claimants were excluded since, although they could be compensated under WCB laws, they were considered illegal workers under New York state law. Older workers were ignored since they represented an insignificant minority of cases and also made up the majority of death cases.

• Death or permanent total disability cases were excluded from the study.

• Missing value cases, which consisted of cases without compensation duration or benefits, were also excluded from the sample population. These cases were processed through the WCB system, but denied compensation benefits, under the WCB laws.
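The three qualification criteria above amount to a simple filter. The sketch below is illustrative only: the `Case` record and its field names are hypothetical and do not reflect the layout of the WCB data file, and the age bounds are assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    """Hypothetical case record; field names are assumptions."""
    age: int
    death_or_ptd: bool                # death or permanent total disability
    duration_weeks: Optional[float]   # None when the value is missing
    benefits: Optional[float]         # None when the value is missing


def qualifies(case: Case) -> bool:
    """Apply the three sample qualification criteria."""
    return (
        18 <= case.age <= 65                  # assumed inclusive bounds
        and not case.death_or_ptd
        and case.duration_weeks is not None
        and case.benefits is not None
    )
```

A cleaned sample would then be `[c for c in sample if qualifies(c)]`.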

Sample size after data cleaning was 7856 cases. Next, a threefold method was used to apportion the data into training and test data sets. One-third of the sample population was used for the training data and the balance was used as test data. In the threefold method, three pairs of training and test data sets were derived by extracting various combinations of three lines of data. The end result was that each case belonged to one training set and two test sets, and this was done to ensure that results could be generalizable to the base data file.
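One way to obtain three training/test pairs in which every case appears in exactly one training set and two test sets is sketched below. The random shuffling, the seed, and the function name `threefold_pairs` are assumptions; the text does not say how cases were assigned to the three parts.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def threefold_pairs(cases: Sequence[T], seed: int = 0) -> List[Tuple[List[T], List[T]]]:
    """Return three (training, test) pairs.

    The cases are split into three parts; pair k trains on part k
    (one-third of the data) and tests on the other two parts.
    """
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    parts = [shuffled[i::3] for i in range(3)]
    pairs = []
    for k in range(3):
        train = parts[k]
        test = [c for j in range(3) if j != k for c in parts[j]]
        pairs.append((train, test))
    return pairs
```

Results for a model can then be averaged over the three test sets.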

Although the database contained a large number of variables (or attributes), it was important to keep in mind that a decision about routing would have to be made as soon as the case entered the WCB system, and as such our model must be based solely on information available at that time. Discussions with staff statisticians at the WCB indicated that only a small subset of variables would be available and prove instrumental in classifying a case for Conciliation. These attributes were age (AGE), gender (SEX), weekly wage (WAGE), occupation (OCCUP), industrial classification code (INDUSTRY), part of body that was injured (POB), and nature of the injury (NATURE). Since the occupation and industry variables were not independent of the POB and NATURE variables, these attributes were not used in further analysis (appropriate chi-square statistics are shown in Table I).
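The independence check mentioned here rests on the Pearson chi-square statistic for a contingency table. A minimal sketch follows; the function name and the example counts are hypothetical, and every row and column is assumed to have a nonzero total.

```python
from typing import List, Sequence, Tuple


def chi_square(table: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom.

    table[i][j] is the observed count for row category i and
    column category j (for example, OCCUP by POB).
    """
    row_totals: List[float] = [sum(row) for row in table]
    col_totals: List[float] = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof
```

A statistic that is large relative to the chi-square distribution with `dof` degrees of freedom indicates that the two attributes are not independent.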

B. Exploratory Data Analysis and Attribute Selection

As mentioned above, the PrIL algorithm accepts only attributes which have been categorized. Therefore, the key attributes were explored carefully to derive appropriate categories to reflect the majority of the cases in the sample. WCB personnel indicated that the combination of injury type and part of body was important, so cross tabulations were performed on these two attributes. These cross tabulations revealed 608 POB-NATURE combinations which accounted for almost 90% of the cases. The POB-NATURE cross categories were then further reduced to derive a smaller subset that would still account for the majority of the cases. AGE and WAGE were categorized according to their quartiles, while SEX was already categorical. Table II shows the final attribute categories.
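Categorizing a continuous attribute such as AGE or WAGE by its quartiles can be sketched as follows. The category codes 1 to 4 and the handling of values that fall exactly on a cut point are assumptions, and the quartile method is the default of Python's `statistics.quantiles`, which may differ from the one used in the study.

```python
import bisect
import statistics
from typing import List, Sequence


def quartile_codes(values: Sequence[float]) -> List[int]:
    """Map each value to a quartile category code from 1 to 4.

    A value equal to a cut point is placed in the lower category.
    """
    cuts = statistics.quantiles(values, n=4)  # three cut points
    return [bisect.bisect_left(cuts, v) + 1 for v in values]
```

In practice the cut points would be estimated on the training data and reused unchanged on the test data.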

TABLE II (a) INDEPENDENT VARIABLES AND VALUE CODES; (b) DEPENDENT VARIABLES

In order to use the PrIL methodology, one must also decide how many classes will be used to group the cases, and what the criteria for class membership are. In the case of the WCB data set, the cutoff limit stipulated in the legislation for Conciliation Bureau cases is an eight-week duration of benefits. In our training set, the median value for duration of benefits was also eight weeks, which would indicate that half the cases should ultimately be assigned to each of the two routes within the WCB. Although this is clear in retrospect, we have noted that the decision on whether to send a case to Conciliation or normal routing is based on what information is available at the time the case enters the system [16]. Given these limitations, eight weeks may not be the optimal criterion. We have therefore chosen to model cutoff points of two, four, and six weeks in addition to the legislated eight weeks in order to propose a potentially more efficient course of action. Figs. 1–4 show the histograms of compensation for the four cutoff points. It is apparent that there are marked differences in the amount of compensation received, with a higher proportion of cases receiving smaller amounts in the shorter cutoff points as compared to the enacted eight-week period.
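The four two-class schemes can be expressed as a single labeling rule applied with different cutoffs. The sketch below is illustrative: the label strings and function names are hypothetical, and a case is assigned to the conciliation class when its duration of benefits is less than or equal to the cutoff.

```python
from typing import Dict, Sequence

CUTOFF_WEEKS = (2, 4, 6, 8)  # 8 weeks is the legislated cutoff


def label_case(duration_weeks: float, cutoff: float) -> str:
    """Two-class label for one case under a given cutoff."""
    return "CONCILIATION" if duration_weeks <= cutoff else "NORMAL"


def conciliation_share(durations: Sequence[float],
                       cutoffs: Sequence[float] = CUTOFF_WEEKS) -> Dict[float, float]:
    """Fraction of cases labeled CONCILIATION under each cutoff."""
    n = len(durations)
    return {c: sum(d <= c for d in durations) / n for c in cutoffs}
```

Since the median duration in the training set was eight weeks, the eight-week scheme splits the cases roughly in half, while shorter cutoffs give a smaller conciliation class.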

Logistic regression using the categorized attributes was performed on the training data to derive the branching variables for the sample data (with SPSS PC Ver. 4.0). Previous work suggested that nature of injury-part of body combinations may be very important in the branching phase [9], [18] and WCB personnel confirmed that POB and NATURE play a large role in the duration of benefits. The results of the regression analysis did in fact indicate that POB, NATURE, and AGE were significant predictors of Conciliation or normal routing (see Table III). Therefore, POB and NATURE were selected as branching attributes. The remaining attributes AGE, WAGE, and SEX were used to generate rules during the subset elimination phase.

C. Setting Reliability Levels

As stated in the initial problem, the WCB currently has no classification system in place, so there are no past decisions to use for modeling the system. Mechanisms are also built into the process to handle transition of a case from Conciliation to normal routing in the event of an error [12]. This made it difficult to judge the relative costs of misclassification and appropriate levels of reliability for the two types of misclassification possible (Conciliation case classified as normal and normal case classified as Conciliation). Realistically, even in exploratory analyses, one would want a set of classification rules to perform better than random or chance classification. Therefore, logic dictated a minimum reliability of 50% for all rules, and we decided to produce a set of trees with this minimum level of reliability. For purposes of comparison, trees with 75% reliability were also generated.

D. The Final Model

Overall, eight different PrIL trees were produced, using minimum reliabilities of 50 and 75%, as well as the four different classification schemes (less than or equal to two weeks, greater than two weeks of compensation benefits; less than or equal to four weeks, greater than four weeks; less than or equal to six weeks, greater than six weeks; and less than or equal to eight weeks, greater than eight weeks). The variables normally available at the time a case enters the system (age, average weekly wage, occupation, sex, type of accident, part of body, and nature of injury) were investigated as branching and rule attributes. The results reported for each of the eight trees are an average of the results of three test subsets.

E. Evaluation of the Model

The results of the PrIL tree rules were first evaluated by classifying the test samples of data using the rules produced by the training sets to verify the reliability and generalizability of the trees generated from the 1989 data. We compared rates of misclassification for our eight models. Logistic regression was also performed for the four different classification schemes in order to compare the results of PrIL to a purely statistical technique. Amongst statistical techniques, logistic regression has fewer assumptions than standard regression or discriminant analysis, which makes it a robust tool [21]. The PrIL output was also compared to traditional decision trees generated by the S-Plus software package. If the classification rates for the PrIL trees are comparable to the rates of classification of the logistic regression and classical decision tree analyses, we can have additional confidence in the utility and validity of PrIL as a classification technique.


Fig. 1. Histogram of compensation for cases less than or equal to two weeks of benefits.

Fig. 2. Histogram of compensation for cases less than or equal to four weeks of benefits.

V. RESULTS

A. PrIL Trees

The correct classification rates and percentage of undecided cases for the 50% reliability trees are shown in Tables IV and V. The two-week cutoff for Conciliation cases produced the highest correct classification rate of 86.10% for the test cases, followed by the four-, the eight-, and the six-week models. Percentage of undecided cases followed the same pattern, with the two-week model producing the lowest number of undecided cases (4.23%) and the six-week model producing the largest number (40.15%).

The 75% reliability trees had higher overall rates of correct classification than the 50% reliability trees. As shown in Table VI, the two-week model was again the best, with an 87.80% correct classification rate; correct classification decreased as the cutoff point increased. Percentages of undecided cases for these models were high, ranging from 28.37% for the two-week test cases to 76.20% for the eight-week test cases (see Table VII).


Fig. 3. Histogram of compensation for cases less than or equal to six weeks of benefits.

Fig. 4. Histogram of compensation for cases less than or equal to eight weeks of benefits.

B. Logistic Regression

The regression models were first built using the training sets and then applied to the test data sets. Three key variables (nature of injury, part of body, and age) were clearly identified as influential variables in all training data sets. The correct classification rates for the four cutoff points are shown in Table VIII. The two-week cutoff gave the highest classification rate. The legislated eight-week cutoff had the lowest classification rate, indicating that it was not the optimal cutoff point in terms of correctly identifying conciliation cases.

C. S-Plus Trees

Results of the S-Plus analysis are summarized in Table IX. This method produced results similar to both the logistic regression and the PrIL trees in terms of the percentage of cases correctly classified. The highest rate of classification, 85.34%, occurred for the two-week model. The lowest correct classification rate of 60.44% was observed for the eight-week model, although the six-week model performed only marginally better at 60.57%. S-Plus, like the logistic regression procedure, does not produce an undecided or unclassified pool of cases.


TABLE III LOGISTIC REGRESSION RESULTS: WALD’S STATISTIC

TABLE IV PERCENTAGE CORRECT (a) CLASSIFICATION RATES USING PrIL (50% RELIABILITY)

TABLE V PERCENTAGE UNDECIDED (b) RATES USING PrIL (50% RELIABILITY)

TABLE VI PERCENTAGE CORRECT (a) CLASSIFICATION RATES USING PrIL (75% RELIABILITY)

TABLE VII PERCENTAGE UNDECIDED (b) RATES USING PrIL (75% RELIABILITY)

TABLE VIII PERCENTAGE CORRECT (b) CLASSIFICATION RATES USING LOGISTIC REGRESSION

TABLE IX PERCENTAGE CORRECT (b) CLASSIFICATION RATES USING S-PLUS

VI. CONCLUSIONS

A. Discussion

The results of the PrIL analysis indicate that the 50% reliability tree with the two-week cutoff point is optimal in terms of maximizing correct classification and minimizing undecided cases, while the eight-week cutoff point imposed by the legislature had the lowest classification of the four models. PrIL results performed better than logistic regression and the traditional trees generated by S-Plus in terms of correctly classifying cases (neither method, however, produces unclassified cases). Although, in general, statistical techniques tend to perform better than decision tree algorithms, our results are not dissimilar. PrIL has the added advantage of deriving a rule file which can be easily translated into English-like IF–THEN–ELSE statements. Such rules are more intuitive for managerial decision making. S-Plus can also provide similar rules but without PrIL’s explicit rule-by-rule reliability measures. The overall similarity between the results of these three classification models supports the validity and reliability of PrIL as a classification tool.

The 75% reliability PrIL trees had higher rates of correct classification than any of the other techniques and also very high percentages of undecided cases. Consequently, the rates of misclassification were also lowest for these models. The general pattern of results was also similar to that discussed above, with the two-week model achieving the highest rate of correct classification and the eight-week model producing the lowest rate. In many situations, these models may actually be preferable if the cost of misclassification is significantly greater than the cost of the additional work necessary to partition the unclassified cases.

In the case of the WCB, if the cutoff were set at two weeks, approximately 14% of cases would qualify for conciliation routing. This may not be a sufficient number of cases to achieve significant reductions in cost and processing time. The total compensation benefits paid for these cases amount to less than 0.01% of the total indemnity benefits paid out to claimants. More information about the costs of processing individual cases and the tradeoffs between various types of misclassification (i.e., normal to conciliation or vice versa), as well as a decision on how to route unclassified cases, is needed to choose the model which will best meet the needs of the WCB. In addition, as more compensated closed cases databases are completed (currently, 1989 is the most recent complete database available), the models can be updated to reflect more current information.

The present form of the PrIL methodology requires that independent variables be categorized into a finite number of categories. Since the current classification codes used by the WCB had a number of categories with very few records of claims, we selected the appropriate POB and NATURE categories that encompassed approximately 95% of the common work related injuries. PrIL uses two variables to branch on. As we have noted, three variables contributed significantly to the logistic regression model: nature of injury, part of body injured, and age. Capturing the full impact of these variables would induce a more realistic partitioning of the trees, since it was found that a case for conciliation is not only determined by the type of injury suffered and the part of body that was affected, but also by the age of the claimant. A case in point would be a claimant with a back sprain, who could be easily classified as a conciliation candidate. However, an older claimant with a back injury should be rightly classified as a nonconciliation case. Ongoing research is investigating integrating neural networks with PrIL to summarize the data prior to branching.

If a computerized screening or decision-making algorithm were implemented by WCB, it could easily be integrated with a computerized claims filing system. Submitted claims could be run through a “filter” which would automatically route them to the Conciliation Bureau or normal processing. Refinements in the PrIL program and improvements in large-scale data storage technology may also allow automatic updating of the decision criteria whenever a new closed compensated case file is completed, giving managers the power to reengineer the system to improve efficiency.

B. Cost/Benefit Implications

As previously noted, the objective of this project was to reduce administrative costs by reengineering claims processing. If our analysis is successful, the cost of handling as well as the time required to identify and expedite conciliation eligible cases would be greatly reduced.

In order to assess the potential fiscal and operational benefits, we used the 50% reliability PrIL tree on an annual workload of between 225 000 and 250 000 cases. This workload is based on a conservative estimate of approximately 1200 cases which would be evaluated daily in order to determine their potential for the conciliation process. Initial identification of this conciliation process is important since the time it takes to process a claim can be reduced by 35–45% (or six to eight months), while affording significant reduction in costs both in back office processing as well as at the legal hearing or review level. From this annual workload, between 80 000 and 100 000 cases may ultimately qualify for conciliation. If the WCB were able to identify such cases on an incoming case basis, back office net process savings of about $385 000 would be realized (based on a 60% correct classification rate and a $7 per-case saving achieved by not handling a case twice).

In addition, a portion of those cases (approximately 10%) are inadvertently sent through the initial phases of the regular hearing process, which increases the processing cost for each case by $30, or an estimated total of $240 000. A spinoff benefit to the injured worker whose case is closed by means of the conciliation process would be a reduction in the number of hearings an individual would need to attend in order to resolve outstanding issues. These issues would be resolved at one predetermined meeting. This results in less lost wage time for many injured workers as well as a $100 to $200 reduction in attorney fees, which are chargeable to the worker’s awards.

These savings translate to approximately $0.5 million, considering the $100 000 investment in the initial software development phase. Yearly savings in excess of $0.6 million can be conservatively estimated since other systemic benefits would also accrue to workers’ compensation risk bearers (carriers and self-insured employers) for their back office and hearing related costs. These costs, which represent a substantial percentage of WCB’s costs, have yet to be quantified.
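The arithmetic behind these estimates can be retraced with a short sketch. It uses only the figures quoted in this subsection; the function names are hypothetical, and treating the $240 000 in avoided hearing costs as a saving that adds to the back office figure is our reading of the text rather than a stated formula.

```python
def back_office_savings(eligible_cases: int,
                        correct_pct: int = 60,
                        saving_per_case: int = 7) -> int:
    """Dollars saved by not handling correctly identified cases twice."""
    return eligible_cases * correct_pct * saving_per_case // 100


def avoided_hearing_cost(eligible_cases: int,
                         misrouted_pct: int = 10,
                         extra_cost_per_case: int = 30) -> int:
    """Dollars saved by not sending cases through the regular hearing process."""
    return eligible_cases * misrouted_pct * extra_cost_per_case // 100


def first_year_net(back_office: int, hearing: int, investment: int = 100_000) -> int:
    """Net first-year saving after the initial software investment."""
    return back_office + hearing - investment
```

For 80 000 to 100 000 eligible cases, `back_office_savings` gives $336 000 to $420 000, which brackets the reported figure of about $385 000, and `avoided_hearing_cost(80_000)` gives $240 000. Combining the reported figures, `first_year_net(385_000, 240_000)` is $525 000, consistent with the approximately $0.5 million quoted above.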

Another nonquantifiable, yet important, aspect of applying PrIL for reengineering claims processing concerns service quality. We can only theorize as to the effect of PrIL on claimant (worker) satisfaction, in terms of the overall effect of this new service delivery mechanism for claims processing: fewer face-to-face encounters with bureaucratic procedures and fewer trips to the WCB. We can say that the worker would be more satisfied with rapid resolution of the case, notwithstanding the outcome of the case itself.

REFERENCES

[1] I. Hellander, D. U. Himmelstein, S. Woolhandler, and S. Wolfe, “Health care paper chase 1993: The cost to the nation, the states, and the District of Columbia,” Int. J. Health Services, vol. 24, no. 1, pp. 1–9, 1994.

[2] E. B. Glick, “A patient’s paper chase,” Western J. Med., vol. 162, no. 2, p. 177, 1995.

[3] L. E. England, “Swimming in a pool of paper,” Mississippi State Med. Assoc., vol. 36, no. 5, pp. 143–144, 1995.

[4] I. Shulman, “Electronic claims,” J. Amer. Dental Assoc., vol. 127, no. 4, p. 424, 1996.

[5] D. Kaslow, “More practices are filing electronic claims,” Dental Econ., vol. 84, no. 11, pp. 32–43, 1994.

[6] Workers’ Compensation Digest of Laws. New York: Workers’ Compensation Board, 1994.

[7] T. Thomason, “Correlates of workers’ compensation claims adjustment,” J. Risk Insurance, vol. 61, no. 1, pp. 59–78, 1994.

[8] P. T. Shultz and K. L. Looram, “Managing workers’ compensation medical costs: A state-by-state guide,” Employee Relations, vol. 19, no. 3, pp. 251–267, 1993–1994.

[9] A. Cheadle, G. Franklin, C. Wolfhagen, J. Savarino, P. Liu, C. Salley, and M. Weaver, “Factors influencing the duration of work-related disability: A population-based study of Washington State Workers’ Compensation,” Amer. J. Public Health, vol. 84, no. 2, pp. 190–196, 1994.

[10] D. D. Ballantyne and C. A. Telles, Workers’ Compensation in New York State. Cambridge, MA: Workers’ Compensation Res. Inst., 1992.

[11] F. O. Gur, “The use of exploratory data analysis to develop decision rules for claims classification at a workers’ compensation board,” Master’s thesis, Dep. Decision Sci. Eng. Syst., Rensselaer Polytechnic Inst., Troy, NY, 1990.

[12] O. C. Dais, “Summary of the Benefits Bill (S454-D),” General Council Office to Chairwoman Patton, Board Members, Bureau Heads, and Legislative Committee, 1990.

[13] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, “The KDD process for extracting useful knowledge from volumes of data,” Commun. ACM, vol. 39, no. 11, pp. 27–34, 1996.


[14] R. J. Brachman, T. Khabaza, W. Kloesgen, G. Piatetsky-Shapiro, and E. Simoudis, “Mining business databases,” Commun. ACM, vol. 39, no. 11, pp. 42–48, 1996.

[15] J. R. Quinlan, “Decision trees and decision making,” IEEE Trans. Syst., Man, Cybern., vol. 20, pp. 339–346, Feb. 1990.

[16] F. O. Gur Ali and W. A. Wallace, “Induction of rules subject to a quality constraint: Probabilistic inductive learning,” IEEE Trans. Knowledge Data Eng., vol. 5, pp. 979–983, June 1993.

[17] C. Glymour, D. Madigan, D. Pregibon, and P. Smyth, “Statistical inference in data mining,” Commun. ACM, vol. 39, no. 11, pp. 35–41, 1996.

[18] F. O. Gur Ali, “Probabilistic inductive learning: Induction of decision rules with reliability measures for decision support,” Ph.D. thesis, Dep. Decision Sci. Eng. Syst., Rensselaer Polytechnic Inst., Troy, NY, 1994.

[19] F. O. Gur Ali and W. A. Wallace, “Classifying delinquent customers for credit collections: An application of probabilistic inductive learning,” Int. J. Human-Comput. Studies, vol. 42, no. 6, pp. 633–646, 1995.

[20] F. O. Gur Ali and W. A. Wallace, “Bridging the gap between business objectives and parameters of data mining algorithms, decision support system,” Decision Support Syst. (Special Issue on Data Mining), vol. 21, no. 1, pp. 3–15, 1998.

[21] J. Neter, M. H. Kutner, C. J. Nachtsheim, and W. Wasserman, Applied Linear Regression Models. Chicago, IL: Irwin, 1990.

[22] W. W. Cohen, “Fast effective rule induction,” in Machine Learning: Proc. 12th Int. Conf. Lake Tahoe, CA: Morgan Kaufmann, 1995, pp. 115–123.

[23] M. J. Norusis, SPSS Advanced Statistics Student Guide. Chicago, IL: SPSS, Inc., 1990, pp. 119–146.

[24] D. W. Hosmer and S. Lemeshow, Applied Logistic Regression. New York: Wiley, 1989.

[25] Bureau of Labor Statistics, U.S. Dep. Labor. (1996). Industry injury and illness data, safety and health statistics. [Online]. Available WWW: http://stats.bls.gov/special.requests/ocwc/osh/os/osnr0005.txt.

Ruthra G. Arunasalam received the Bachelor’s degree in agribusiness from University Putra Malaysia, and the Master’s degree in information systems and the Master’s degree in management from Stirling University, Stirling, U.K. He is currently pursuing the Ph.D. degree in management at the Lally School of Management and Technology, Rensselaer Polytechnic Institute, Troy, NY.

He has ten years experience in teaching, software development, and conducting research at a foreign university. He is presently a Project Associate in the Bureau of Medical Management at the New York State Workers Compensation Board. His research interests include the application of data mining technologies to workers compensation health care delivery systems, evaluating performance metrics, service quality, and access to health care using telemedicine networks.

Dr. Arunasalam is a member of INFORMS.

Jill T. Richie received the B.A. degree in psychology from Harvard University, Cambridge, MA, and the M.S. degree in industrial/organizational psychology from Rensselaer Polytechnic Institute, Troy, NY. She is currently pursuing the Ph.D. degree in decision sciences, with a focus in statistics, at Rensselaer Polytechnic Institute.

Prior to receiving the M.S. degree, she worked on a research project examining the impact of managed care on the health status and patient satisfaction of elderly Veterans with the VA system. Her current research interests include multivariate statistics, analysis of rank order data, and statistical education.

Ms. Richie is a member of the American Statistical Association, the American Psychological Association, and INFORMS, where she is currently serving as the Rensselaer Student Chapter President.

William Egan received the Bachelor’s degree in management from Rensselaer Polytechnic Institute, Troy, NY, in 1967.

He is presently the Director of the Bureau of Medical Management at the New York State Workers’ Compensation Board (WCB). In this role, he is responsible for coordinating activities relating to research, education, customer interaction, and the implementation and monitoring of programs focused on quality of care delivery, as represented by managed care organizations, preferred provider networks, and voluntary programs to injured workers in the state of New York. He has worked with the WCB since 1978. His previous positions at WCB include Director of Workers’ Compensation Compliance and Regulatory Services (1990–1998), Director of WC Finance and Policy (1985–1990), and Director of WC Accounts (1978–1985).

Ozden Gur-Ali received the Ph.D. degree from the Decision Sciences and Engineering Systems Department, Rensselaer Polytechnic Institute, Troy, NY, in 1994.

She worked for General Electric Research and Development for four years developing new approaches to industry problems for medical, plastics, broadcasting, and financial industries using data mining. Currently, she is a Consultant at ZS Associates, Evanston, IL, focusing on applying data mining to health care and pharmaceutical industries, especially in marketing and sales areas. Her research interests include data mining algorithms, design of experiments for data mining algorithms, and the hybrid use of algorithms. She has ten refereed publications on the applications of data mining methods, including PrIL, in journals such as IEEE TRANSACTIONS ON NEURAL NETWORKS, Decision Support Systems, International Journal of Human–Computer Studies, IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, and Communications in Statistics: Simulation and Computation. She has also published in the Knowledge Discovery and Data Mining (KDD) and CONRAD conferences.

William A. Wallace (M’90–SM’96) received the Bachelor’s degree in chemical engineering from the Illinois Institute of Technology, Chicago, in 1956 and the Master’s and Ph.D. degrees in management science from Rensselaer Polytechnic Institute, Troy, NY, in 1961 and 1965, respectively.

He is Professor of Decision Sciences and Engineering Systems at Rensselaer Polytechnic Institute, Troy, NY. He has over 20 years experience in research and development in management science and decision support systems, with particular emphasis on crisis management. He was recently appointed Vice-Chair of the National Research Council’s Committee on Advanced Information Technology for the Maritime Industry. He is presently engaged in research on computer-based decision aids for emergency managers and the process of modeling.