
Detection of Health Insurance Fraud with Discrete Choice

Model: Evidence from Medical Expense Insurance in China

Abstract:

Health insurance fraud increases the inefficiency and inequality in our society. To

address this widespread problem, cost-effective techniques are needed to detect fraudulent

claims. With a dataset from medical expense insurance in China, we propose a discrete

choice model to identify predicting factors of fraudulent claims, and we address the

major limitations of the discrete choice model by considering oversampling of fraudulent

cases, as well as mislabeling of legitimate claims (omission error). Our results show

that a few factors, such as hospital’s qualification and policyholder’s renewal status,

could be used to predict fraudulent claims for further investigation.

Key words:

Medical expense insurance, insurance fraud, discrete choice model, predicting factor,

omission error


1 Introduction

Health insurance is a critical mechanism for financing healthcare needs in a modern

society. Health insurance fraud comes as an unwanted byproduct, contributing to rising

health insurance costs and resulting in significant social welfare loss. According to

the Global Health Care Anti-fraud Network (GHCAN), health insurance fraud has

become a worldwide problem suffered by both developed countries with sophisticated

healthcare systems and developing countries with emerging health insurance markets.

Globally, it is estimated that the annual total cost of health insurance fraud could reach

$260 billion, or 6% of global healthcare spending.1 In the U.S. alone, it is estimated

that health insurance fraud costs up to $80 billion annually, accounting for 3% of the

annual national health care spending.2 In an emerging market such as China, the
commercial health insurance market is still at a nascent stage in terms of premium
income,3 but fraud is already widespread, causing losses equal to 10%-30% of

premium income (Mao, 2008; Munich Re, 2013).4 The China Insurance Regulatory

Commission (CIRC) estimated the growth rate of insurance fraud cases was around 20%

in 2011, and in response to this rising problem CIRC proposed to build its own

insurance anti-fraud system in 2012.5

Multiple stakeholders should be involved to detect fraudulent claims effectively and

accurately, including academia, the insurance industry, regulatory institutions and

international organizations such as the GHCAN. In a developed market, all

stakeholders work coherently and develop an advanced fraud detection system using

abundant data and predictive analytics to provide efficient fraud management.6 In an

emerging market such as China, the typical procedure to detect health insurance fraud

still follows simple guidance criteria such as claim amount threshold, and then largely

relies on the experience and skill of an individual claim adjuster to perform a manual

investigation. Both the efficiency and accuracy could be improved dramatically with

an automated fraud detection system. Despite the urgent need, as far as we know,

there has been no study focused on health insurance fraud in China yet. We attempt to
fill this gap and provide evidence on contributing factors in predicting health insurance
fraud in this emerging market.

1 http://www.ghcan.org/challenge.html
2 http://www.fbi.gov/about-us/investigate/white_collar/health-care-fraud
3 In 2012, the premium income of health insurance in China was 86.3 billion Yuan ($14 billion), comprising merely 8% of the total life insurance premium.
4 http://www.docin.com/p-718280146.html
5 http://www.circ.gov.cn/tabid/5171/InfoID/219312/frtid/5225/Default.aspx
6 http://www.fico.com/en/products/fico-insurance-fraud-manager-health-care-edition/

We first develop our theoretical background and hypotheses, then present our data and
empirical models and discuss the results. Concluding remarks are presented at the end.

2 Theory and Hypotheses Development

2.1 Overview on Fraud Detection Methodology

The methods of detecting insurance fraud fall largely into two groups. The supervised

learning methods make use of prior information on the dependent variable (fraudulent

or legitimate) in a training subset of data to obtain patterns in predicting variables. Some

examples of supervised learning methods include discrete choice models (Artis et al.,

1999, 2002; Belhadji et al., 2000; Caudill et al., 2005), other standard econometric

models (Weisberg and Derrig, 1991, 1995, 1998), the expert system (Major and

Riedinger, 2002; Stefano and Gisella, 2001), as well as active learning and cost-

sensitive learning methods. Unsupervised learning methods do not rely on

predetermined status of dependent variable but extract information from the predicting

variables directly. Some examples include cluster analysis, unsupervised neural

network (Brockett et al., 1998) and other data mining methods (Kou et al., 2004; Yamanishi,

2004).

Compared to unsupervised methods, supervised methods tend to be more accurate since

additional information on dependent variable is employed in the training sample. But

the major limitations are: first, it could be difficult and/or costly to obtain “labels” for the

training sample; second, due to the nature of fraud, unbalanced data (too few fraudulent

cases compared with legitimate ones) is almost inevitable and requires specific

treatment; third, the labeling of dependent variable could be inaccurate

(misclassification problem).

In our study, we obtained a dataset with prior information of whether the claim is

fraudulent, therefore the choice of using a supervised learning method is natural.

Among different supervised methods, we choose the discrete choice model. It is
straightforward to use, and the results can be easily interpreted. In addition, we use
weighted exogenous sampling maximum likelihood estimation to address the
oversampling of fraudulent claims in our sample, and further consider omission error
to address the inaccuracy of the predetermined labelling of the dependent variable.

2.2 Literature Summary on Health Insurance Fraud Predicting Indicators

In the area of detecting insurance fraud, various methods are applied in different lines

of products as shown in Table 1.

Table 1 Summary of relevant literature

Methodology supervised learning methods unsupervised learning

methods

Derrig (2002)

Hausman et al. (1998)

Li et al. (2008)

Manski and Lerman (1977)

Brockett et al. (1998)

Kou et al. (2004)

Yamanishi (2004)

Empirical

auto

insurance

Artis et al. (1999)

Artis et al. (2002)

Belhadji et al. (2000)

Caudill et al. (2005)

Derrig and Ostaszewski (1995)

Stefano and Gisella (2001)

health

insurance

He et al. (1997)

Liou et al. (2008)

He et al. (2000)

Liou et al. (2008)

Major and Riedinger

(2002) Ortega et al. (2006)

Shin et al. (2012)

Yamanishi et al. (2004) Yang and Hwang (2006)

other lines

(BI in auto)

Viaene et al. (2002)

Weisberg and Derrig (1990,

1993, 1995, 1998)

Ai et al. (2009)

Brockett et al. (2002)

While there are a series of empirical studies on insurance fraud in auto lines (for either

property damage or bodily injury claims) (Artis et al., 1999; Brockett et al., 2002;

Caudill et al., 2005; Derrig and Ostaszewski, 1995), scholars have started to present findings
in health insurance as data become available (He et al., 1997; Liou et al., 2008; Major
and Riedinger, 2002; Yamanishi et al., 2004). As suggested by Li et al. (2008), due to
legal issues or concerns over privacy protection, papers presenting details on
indicators of health care fraud are scarce.


Most of the existing studies employ unsupervised learning methods. Major and
Riedinger (1992) analyzed Electronic Fraud Detection (EFD) as used by an insurance
company. Their work provides a general framework for classifying health insurance fraud
indicators into five categories: financial indicators, medical logic
indicators (whether a medical situation would normally happen), abuse indicators
(frequency of treatment), logistics indicators (the place, timing and sequence of
activities), and identification indicators (the way providers present information).

Specifically for health fraud committed by medical laboratories, Yamanishi et al. (2004)
used an outlier detection method and identified the distribution of test categories
(chemical, microbiology, and immunology), the number of different patients, and the
test frequency as potential indicators of fraud.

Regarding abusive utilization in outpatient clinics, Shin et al. (2012) use a scoring
model to detect abusive outpatient billing patterns from profiling information extracted
from electronic insurance claims in South Korea. They rely on domain experts to
generate an index that decides whether further investigation is warranted. The index
includes measures of the composition of various charges (total utilization, medications,
injections, laboratory tests, and diagnostic radiology), total charges for the five most
frequent diagnoses, rates of utilization of specific services (antibiotics and
corticosteroids), and utilization of visits and prescription drugs.

Similar to Shin et al. (2012), Liou et al. (2008) take the healthcare provider as the unit
of analysis for fraudulent medical claims, and they use three different approaches:
logistic regression, neural networks and classification trees. They use nine
variables: average days of drug dispense, average drug cost, average
consultation and treatment fees, average diagnosis fees, average dispensing service fees,
average medical expenditure, average amount claimed, average drug cost per day, and
average medical expenditure per day. They find that eight of the nine variables are
significant predictors.

Our study differs from the previous literature in three ways. First, we adopt a more
sophisticated discrete choice model to detect medical fraud, a type of method that was
not frequently used before. Second, we are the first to focus on a product providing
inpatient medical expense insurance in China. Third, we focus on indicators of
individual fraudulent behavior (the insured) rather than institutional behavior (the
healthcare provider), so our results could provide more informative inference for insurers.

2.3 Hypotheses Development for Specific Indicators

We choose characteristics on healthcare provider and service (type of hospital, number

of days in hospital this time and previously under this product, total cost, and

composition of total cost across bed charge, medicine, care, diagnosis, treatment,

operation and lab test) and characteristics on policy (coverage, renewal status, claim

duration, file duration and previous claim frequency etc.) as our fraud indicators,

controlling for demographics of the insured (sex, age, occupation, marital status, and

income).

Specifically, we hypothesize that:

a. A few variables defining the nature of hospital are predictive of medical fraud.

The type or ranking of a hospital could be predictive of fraud. Lower-ranked
community clinics could be networked more easily and are therefore more prone to
fraudulent behavior than the top national hospitals (ranked III-A).

In addition, if a hospital is a qualified provider under the insurance contract, the

probability of fraud would decrease. Furthermore, if the policyholder seeks service

from a recommended provider, the probability of fraud should also decrease.

b. The number of days stayed in hospital and total cost for current stay.

These two variables depend on each other to some extent. We hypothesize
that a longer hospital stay or a larger bill for the stay is a likely signal of
fraudulent behavior. As these signals draw the attention of the claim
adjuster, there is a higher probability of fraudulent behavior being discovered.

c. Composition of cost.

The total cost consists of seven categories: bed charge, medicine

charge, diagnosis charge, treatment charge, test charge, operation charge and charge

for care (labor) delivered. If one or a few categories are dominant in the total cost,

it could be a potential signal of fraudulent claim.

d. Coverage type.


If the fraud is planned, the fraudster may tend to purchase a policy with a higher limit

and more comprehensive coverage, therefore, we hypothesize there is a positive

correlation between coverage type and fraud.

e. Renewal status, number of days stayed in hospital in previous claims, and number
of claims filed previously.

These three variables indicate the history of a given insured with the product. We
hypothesize that a renewing customer is less likely to commit fraud.
Furthermore, if the insured filed claims previously, he/she has undergone claim
auditing before, which diminishes the probability of fraud.

f. Number of other policies with the same company.

We hypothesize that a customer who bought other policies (such as auto insurance)
is less likely to commit fraud, because information gathered from other
policies could be used by the insurer in claim auditing.

g. Self-claim preparation.

If a claim is filed and the materials are prepared by the insured himself/herself, we
hypothesize that the probability of fraud would diminish.

h. Claim duration.

This is the number of days between policy commencement and hospitalization. If it’s a

planned fraud, the fraudster tends to shorten the claim duration, therefore there is a

negative correlation between claim duration and fraud.

i. File duration.

It’s the number of days between hospitalization and submission of complete claim

files. For fraudulent claims, it might take longer to forge the material, resulting

in a positive correlation between file duration and fraud.
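The duration and cost-composition indicators in hypotheses c, h and i are simple transformations of raw claim fields. The following is a minimal sketch of how they could be derived; the record layout, the field names and the example values are hypothetical and are not taken from the insurer's data.

```python
from datetime import date

# Hypothetical raw claim record; field names and values are illustrative only.
claim = {
    "policy_start": date(2009, 1, 1),
    "admission": date(2009, 3, 2),
    "files_submitted": date(2009, 4, 1),
    "charges": {"bed": 800.0, "medicine": 4000.0, "care": 100.0,
                "diagnosis": 100.0, "treatment": 1500.0,
                "test": 1500.0, "operation": 0.0},
}

def build_indicators(rec):
    """Derive duration and cost-composition indicators from one claim record."""
    total = sum(rec["charges"].values())
    out = {
        # Hypothesis h: days from policy commencement to hospitalization.
        "claim_duration": (rec["admission"] - rec["policy_start"]).days,
        # Hypothesis i: days from hospitalization to submission of claim files.
        "file_duration": (rec["files_submitted"] - rec["admission"]).days,
        "tot_cost": total,
    }
    # Hypothesis c: share of each charge category in the total cost.
    for name, amount in rec["charges"].items():
        out[name + "_share"] = amount / total if total > 0 else 0.0
    return out

indicators = build_indicators(claim)
```

A dominant category then shows up as a share close to one, which is the signal hypothesis c refers to.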

3 Data

3.1 Medical Expense Insurance Fraud in China

There are three main types of health insurance products in China, namely medical

expense insurance, critical illness insurance and accident insurance with health expense

coverage. We chose medical expense insurance as our target product because it is the

dominant health insurance product, and the fraud is more prevalent compared to the

other two products. We obtained data of an individual inpatient medical expense


insurance product from a leading health insurance company in China. Persons aged
between 28 days and 59 years are eligible to purchase this product. It is designed

with three levels of coverage, with the premium depending on age, gender and coverage

level. The coverage limits in various sub-categories are described in Table 2. There is

no deductible, and the copayment percentage is 20%. An additional coverage of 5%

of the medical expense claim payoff is provided if the insured seeks healthcare from a

recommended hospital.

Table 2 Insurance coverage for individual medical expense insurance

Medical expense coverage in sub-categories (in Yuan) | Low coverage | Medium coverage | High coverage
Bed charge: average daily limit | 50 | 80 | 100
Bed charge: total limit | 4,500 | 7,200 | 9,000
Medicine charge: average daily limit | 100 | 150 | 200
Medicine charge: total limit | 9,000 | 13,500 | 18,000
Care charge | 200 | 500 | 900
Diagnosis charge | 200 | 500 | 900
Treatment charge | 1,500 | 3,000 | 4,500
Lab charge | 2,000 | 4,000 | 6,000
Operation charge | 2,000 | 4,000 | 6,000

Additional coverage (in Yuan): additional 5% of medical expense claim payoff
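To make the coverage terms concrete, the sketch below estimates a payoff from the total limits in Table 2, the 20% copayment and the 5% additional coverage. It rests on several assumptions that the text does not confirm: that each category is capped at its total limit before the copayment is applied, that the additional coverage is 5% of the resulting payoff, and that the average daily limits and any social insurance deduction can be ignored. The function and variable names are hypothetical.

```python
# Total limits in Yuan from Table 2, indexed by coverage level
# (0 = low, 1 = medium, 2 = high). Daily limits are ignored in this sketch.
TOTAL_LIMITS = {
    "bed": (4500, 7200, 9000),
    "medicine": (9000, 13500, 18000),
    "care": (200, 500, 900),
    "diagnosis": (200, 500, 900),
    "treatment": (1500, 3000, 4500),
    "lab": (2000, 4000, 6000),
    "operation": (2000, 4000, 6000),
}
COPAY_RATE = 0.20          # copayment percentage stated in the text
RECOMMENDED_BONUS = 0.05   # additional coverage for a recommended hospital

def estimated_payoff(charges, level, recommended_hospital):
    """Assumed rule: cap each category, apply copayment, then add the 5% bonus."""
    capped = sum(min(amount, TOTAL_LIMITS[cat][level])
                 for cat, amount in charges.items())
    payoff = capped * (1.0 - COPAY_RATE)
    if recommended_hospital:
        payoff *= 1.0 + RECOMMENDED_BONUS
    return payoff

example = {"bed": 500.0, "medicine": 3000.0, "treatment": 2000.0}
base = estimated_payoff(example, 0, False)       # treatment capped at 1,500
with_bonus = estimated_payoff(example, 0, True)
```

Under these assumptions the example claim of 5,500 Yuan yields about 4,000 Yuan at a non-recommended hospital and about 4,200 Yuan at a recommended one.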

There exists a range of definitions for health insurance fraud, from hard fraud in the

form of criminal actions to soft fraud in the form of over-utilization or over-estimation

of existing expense (Ai et al., 2009). In this product, the major types of fraud include

concealing a pre-existing condition, forgery of medical expense receipts and documents,

as well as inflating days of inpatient service. There is virtually no consensus on the

definition of insurance fraud in the existing literature. We use the insurer’s decision as

a proxy of insurance fraud in model 1 and adjust for the insurer’s omission error in

model 2.


3.2 Sample Selection

We obtained data of all claims filed in 2009 and 2010 for this inpatient medical expense

insurance product. The claims are divided into two categories, zero payoff and non-zero payoff,

according to insurer’s claim decision. We treat zero claim payoff as definite evidence

for the existence of fraud. The non-zero payoff claims could be further divided into

fully paid (adjusted to copayment and coverage limit) and partially paid claims.

However, the majority of the partially paid claims are due to the deduction of payment

from the social medical insurance program, so it would be unfair to label them as fraud.

Therefore we treat all partially paid claims as legitimate claims in our analysis, and only

regard zero claim payoff as fraud cases.

Table 3 Summary of total claims and sampled claims

Year | Total # of claims | Fraudulent cases | Fraud % in population | Sampled # of claims | Sampled fraudulent cases | Fraud % in sample
2009 | 3,868 | 236 | 6.10% | 451 | 155 | 34.37%
2010 | 4,205 | 255 | 6.06% | 512 | 224 | 43.75%
Overall | 8,073 | 491 | 6.08% | 963 | 379 | 39.36%

Table 3 summarizes our sampling procedure. Overall in 2009 and 2010, the

percentage of fraudulent cases (zero claim payoff) is around 6%. In order to capture

enough fraud cases in the training sample to analyze its predictor variables, we use non-

random sampling, so in our sample the percentage of fraudulent cases increases to

39%.7 We adjust for this non-random sampling in specification 2.

3.3 Descriptive Statistics

Table 4 gives a complete summary of variable definitions and descriptive statistics for
our sample. Overall, the data provide information on three different levels: first,
characteristics of the insured (sex, age, occupation, marital status, and income); second,
characteristics of the healthcare provider and service (hospital type, days of hospital stay,
total cost, composition of the total cost); third, characteristics of the policy (coverage,
claim history information and number of policies purchased from other insurance
companies). Table 5 compares descriptive measures for the two subsamples of
fraudulent and legitimate claims.

7 Among all 491 fraud cases in 2009 and 2010, we aimed to capture all information, but due to duplication of claims
and missing information, we were left with 77% of all fraud claims, resulting in 379 sampled fraudulent cases.
Among all 7,582 legitimate claims, we randomly selected 600 claims (300 each from 2009 and 2010); after removing
duplicates and claims with missing information, we were left with 585 sampled legitimate cases, comprising 7.7% of
all legitimate claims.

Table 4 Variable Definition and Summary Statistics

Variable | Definition | Mean | Standard Deviation | Minimum | Maximum

Dependent variable
fraud1 | Equals 1 if the claim is rejected completely, 0 otherwise. | 0.394 | 0.489 | 0 | 1

Characteristics of the insured
sex | Equals 1 if male, 0 if female. | 0.496 | 0.500 | 0 | 1
age_claim | Age of insured when the claim is filed. | 28.150 | 19.559 | 0 | 60
child_dummy | Equals 1 if the insured’s age is between 0 and 18 when the claim is filed, 0 otherwise. | 0.332 | 0.471 | 0 | 1
adult_dummy | Equals 1 if the insured’s age is between 19 and 59 when the claim is filed, 0 otherwise. | 0.663 | 0.473 | 0 | 1
elder_dummy | Equals 1 if the insured’s age is 60 or above when the claim is filed, 0 otherwise. | 0.005 | 0.072 | 0 | 1
occupation | A standard classification of occupation type from 1 to 6, with a greater number corresponding to greater risk. | 2.109 | 0.847 | 0 | 4
marital | Equals 1 if married, 0 otherwise. | 0.614 | 0.487 | 0 | 1
income | Individual annual income. | 60,885 | 49,831 | 6,000 | 500,000

Characteristics of healthcare provider and service
hosp_type | Equals 3 if it’s a grade III-A hospital, 2 if it’s a grade III hospital, 1 if it’s a grade II-A hospital, and 0 otherwise. The grade III-A hospitals are the best ones in China. | 1.890 | 1.048 | 0 | 3
hosp_rec | Equals 2 if the hospital is on the recommendation list of the insurer, 1 if it is an assigned hospital of the insurer, and 0 otherwise. | 1.038 | 0.712 | 0 | 2
hosp_rec_dummy1 | Equals 1 if the hospital is a qualified hospital of the insurer but not on the recommendation list, 0 otherwise. | 0.492 | 0.500 | 0 | 1
hosp_rec_dummy2 | Equals 1 if the hospital is not a qualified hospital of the insurer, 0 otherwise. | 0.235 | 0.424 | 0 | 1
hosp_day | Number of days that the insured stayed in hospital this time. | 14.020 | 13.900 | 0 | 218
hosp_day_pre | Number of days for previous hospital stays under this policy. | 0.541 | 3.635 | 0 | 72
tot_cost | Total expenditure. | 8,209 | 18,943 | 262 | 476,385
bed_per | Percentage of expenditure on bed cost. | 0.083 | 0.117 | 0.000 | 1.000
med_per | Percentage of expenditure on medicine. | 0.453 | 0.223 | 0.000 | 1.000
care_per | Percentage of expenditure on care (labor). | 0.017 | 0.029 | 0.000 | 0.500
diag_per | Percentage of expenditure on diagnosis service. | 0.013 | 0.036 | 0.000 | 0.502
treat_per | Percentage of expenditure on treatment. | 0.184 | 0.162 | 0.000 | 1.000
test_per | Percentage of expenditure on lab test. | 0.193 | 0.152 | 0.000 | 0.900
oper_per | Percentage of expenditure on operation cost. | 0.058 | 0.121 | 0.000 | 0.647

Characteristics of the policy
coverage_type | The level of coverage (corresponding to levels in Table 2). | 1.130 | 0.380 | 1 | 3
self_policyholder | Equals 1 if the insured is the policyholder, 0 otherwise. | 0.563 | 0.496 | 0 | 1
renew | The total number of years since the insured first purchased this product. | 2.980 | 1.770 | 1 | 7
num_other_policy | Number of valid policies the insured purchased from other insurance companies. | 0.078 | 0.318 | 0 | 3
self_claim | Equals 1 if the insured filed the claim himself/herself, 0 otherwise. | 0.733 | 0.443 | 0 | 1
claim_duration | Number of days between policy commencement date and hospital admission date. | 193.351 | 97.493 | 0 | 364
file_duration | Number of days between hospital admission date and claim material submission date. | 68.627 | 84.520 | 6 | 829
claimfreq_pre | Number of claims filed prior to current claim. | 0.736 | 1.679 | 0 | 18

Table 5 Summary Statistics for Two Subsamples

Columns 2-5 refer to observed fraudulent claims; columns 6-9 refer to observed legitimate claims.

Variable | Mean | Standard Deviation | Minimum | Maximum | Mean | Standard Deviation | Minimum | Maximum | Mean Difference | P-Value

Dependent variable
fraud1 | 1.000 | 0.000 | 1 | 1 | 0.000 | 0.000 | 0 | 0 | |

Characteristics of the insured
sex | 0.464 | 0.499 | 0 | 1 | 0.517 | 0.500 | 0 | 1 | 0.0527 | 0.110
age_claim | 31.369 | 17.612 | 0 | 60 | 26.060 | 20.470 | 0 | 60 | -5.3095*** | 0.000
child_dummy | 0.240 | 0.428 | 0 | 1 | 0.392 | 0.489 | 0 | 1 | 0.1520*** | 0.000
adult_dummy | 0.755 | 0.431 | 0 | 1 | 0.603 | 0.490 | 0 | 1 | -0.1519*** | 0.000
elder_dummy | 0.005 | 0.073 | 0 | 1 | 0.005 | 0.072 | 0 | 1 | -0.0001 | 0.977
occupation | 1.939 | 0.816 | 0 | 4 | 2.219 | 0.849 | 0 | 4 | 0.2799*** | 0.000
marital | 0.686 | 0.465 | 0 | 1 | 0.567 | 0.496 | 0 | 1 | -0.1192*** | 0.000
income | 62,244 | 55,267 | 10,000 | 500,000 | 60,003 | 45,989 | 6,000 | 500,000 | -2241.0330 | 0.496

Characteristics of healthcare provider and service
hosp_type | 1.923 | 1.090 | 0 | 3 | 1.868 | 1.020 | 0 | 3 | -0.0553 | 0.424
hosp_rec | 0.860 | 0.708 | 0 | 2 | 1.154 | 0.691 | 0 | 2 | 0.2940*** | 0.000
hosp_rec_dummy1 | 0.480 | 0.500 | 0 | 1 | 0.500 | 0.500 | 0 | 1 | 0.0198 | 0.549
hosp_rec_dummy2 | 0.330 | 0.471 | 0 | 1 | 0.173 | 0.379 | 0 | 1 | -0.1569*** | 0.000
hosp_day | 13.960 | 12.023 | 0 | 113 | 14.058 | 15.002 | 0 | 218 | 0.0978 | 0.915
hosp_day_pre | 0.322 | 2.251 | 0 | 22 | 0.683 | 4.297 | 0 | 72 | 0.3613 | 0.132
tot_cost | 10,392 | 27,077 | 262 | 476,385 | 6,792 | 10,565 | 515 | 145,878 | -3600.7280*** | 0.004
bed_per | 0.088 | 0.170 | 0.000 | 1.000 | 0.080 | 0.062 | 0.000 | 0.613 | -0.0082 | 0.286
med_per | 0.430 | 0.261 | 0.000 | 1.000 | 0.468 | 0.193 | 0.000 | 1.000 | 0.0384*** | 0.009
care_per | 0.015 | 0.036 | 0.000 | 0.500 | 0.018 | 0.023 | 0.000 | 0.234 | 0.0027 | 0.156
diag_per | 0.015 | 0.054 | 0.000 | 0.502 | 0.011 | 0.017 | 0.000 | 0.215 | -0.0037 | 0.119
treat_per | 0.191 | 0.188 | 0.000 | 1.000 | 0.179 | 0.144 | 0.000 | 1.000 | -0.0116 | 0.278
test_per | 0.181 | 0.146 | 0.000 | 0.801 | 0.201 | 0.155 | 0.000 | 0.900 | 0.0202** | 0.044
oper_per | 0.081 | 0.139 | 0.000 | 0.647 | 0.043 | 0.105 | 0.000 | 0.629 | -0.0377*** | 0.000

Characteristics of the policy
premium | 703 | 275 | 326 | 2398 | 734 | 245 | 326 | 1778 | 31.4033* | 0.065
coverage_type | 1.161 | 0.440 | 1 | 3 | 1.110 | 0.334 | 1 | 3 | -0.0514** | 0.040
self_policyholder | 0.633 | 0.483 | 0 | 1 | 0.517 | 0.500 | 0 | 1 | -0.1161*** | 0.000
renew | 2.171 | 1.537 | 1 | 7 | 3.502 | 1.715 | 1 | 7 | 1.3307*** | 0.000
num_other_policy | 0.069 | 0.301 | 0 | 3 | 0.084 | 0.328 | 0 | 3 | 0.0153 | 0.466
self_claim | 0.789 | 0.409 | 0 | 1 | 0.697 | 0.460 | 0 | 1 | -0.0920*** | 0.002
claim_duration | 185.285 | 93.049 | 0 | 364 | 198.586 | 99.980 | 0 | 363 | 13.3007** | 0.039
file_duration | 81.277 | 97.303 | 7 | 829 | 60.418 | 74.010 | 6 | 650 | -20.8592*** | 0.000
claimfreq_pre | 0.327 | 0.934 | 0 | 8 | 1.002 | 1.977 | 0 | 18 | 0.6745*** | 0.000


4 Model and Methodology

4.1 Discrete-choice Model

The model takes the form of a binary probit regression with the dependent variable
equal to one if the claim is identified as a fraudulent case. Assume the following
functional relationship:

$$Y_i^* = X_i\beta + e_i$$

$Y_i^*$ is the latent variable, $X_i$ is a vector of the observed explanatory variables, $\beta$ is
a vector of unknown parameters, and $e_i$ is a disturbance term. The claim is
determined to be fraudulent if $Y_i^* > 0$; otherwise it is legitimate. Let the observed
indicator of fraud be $Y_i$; then we have:

$$Y_i = \begin{cases} 1, & \text{if } Y_i^* > 0 \\ 0, & \text{otherwise} \end{cases}$$

The probability of fraud is

$$\begin{aligned}
\Pr(Y_i = 1 \mid X_i) &= \Pr(Y_i^* > 0 \mid X_i) \\
&= \Pr(X_i\beta + e_i > 0 \mid X_i) \\
&= \Pr(e_i > -X_i\beta) \\
&= 1 - F(-X_i\beta)
\end{aligned}$$

The probability of the claim being legitimate is

$$\begin{aligned}
\Pr(Y_i = 0 \mid X_i) &= \Pr(Y_i^* \le 0 \mid X_i) \\
&= \Pr(X_i\beta + e_i \le 0 \mid X_i) \\
&= \Pr(e_i \le -X_i\beta) \\
&= F(-X_i\beta)
\end{aligned}$$

where $F(\cdot)$ is the cumulative distribution function of $e_i$.

If we assume that $e_i$ follows a normal distribution, we obtain the probit model.
Let the cumulative distribution function of the standard normal distribution be $\Phi(\cdot)$; then,
by the symmetry of the normal distribution,

$$\Pr(Y_i = 1 \mid X_i) = \Phi(X_i\beta)$$

$$\Pr(Y_i = 0 \mid X_i) = 1 - \Phi(X_i\beta)$$

The log-likelihood function is

$$\ln L = \frac{1}{n}\sum_{i=1}^{n}\left[ Y_i \ln \Phi(X_i\beta) + (1 - Y_i)\ln\left(1 - \Phi(X_i\beta)\right) \right] \qquad (1)$$
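The probit probability and the average log-likelihood in equation (1) can be evaluated directly. The sketch below uses only the Python standard library; the toy data and the function names are hypothetical, and no optimizer is included, so it illustrates the objective rather than the estimation procedure used in the paper.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function

def probit_prob(x, beta):
    """Pr(Y_i = 1 | X_i) = Phi(X_i beta) for one observation."""
    return PHI(sum(xj * bj for xj, bj in zip(x, beta)))

def probit_loglik(X, y, beta):
    """Average log-likelihood as in equation (1)."""
    total = 0.0
    for x, yi in zip(X, y):
        p = probit_prob(x, beta)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)

# Toy data (illustrative only): an intercept column plus one regressor.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
y = [1, 0, 1]
ll_null = probit_loglik(X, y, [0.0, 0.0])  # every probability equals 0.5
ll_alt = probit_loglik(X, y, [0.0, 1.0])   # slope pointing in the right direction
```

With all coefficients at zero the average log-likelihood equals ln(0.5); a coefficient vector that separates the toy fraud and legitimate cases better yields a higher value, which is what maximum likelihood estimation searches for.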

Due to the sampling method and nature of our data, we improve the probit model in two
directions in the following two sections. Model 1 in section 4.2 addresses the
oversampling problem and model 2 in section 4.3 attempts to address the misclassification
problem.

4.2 Probit Model with Weighted Exogenous Sampling Maximum Likelihood

Estimation

Overall, 6% of all claims in 2009 and 2010 are fraudulent, but in our sample the share of
fraudulent cases increases to 39% because of the oversampling of fraudulent claims. To adjust
for the oversampling, we follow Manski and Lerman (1977) and use a weighted
exogenous sampling maximum likelihood (WESML) estimator. It modifies the
classic log-likelihood function and provides a consistent and asymptotically normal
estimator. Artis, Ayuso and Guillen (1999) use this method to correct the
oversampling of fraud claims in auto insurance.

Consider the following weighted exogenous sampling log-likelihood function
corresponding to our model:

$$\ln L_w(\beta \mid y) = w_1 \sum_{\{y_i = 1\}} \ln(p_i) + w_0 \sum_{\{y_i = 0\}} \ln(1 - p_i) \qquad (2)$$

where $p_i$ is the probability that claim $i$ is fraudulent and

$$w_1 = \frac{\pi_1}{\pi_2}, \qquad w_0 = \frac{1 - \pi_1}{1 - \pi_2}$$

Here $\pi_1$ is the percentage of fraudulent samples in the total claims (population), and
$\pi_2$ is the percentage of fraudulent samples in the sample. The weighting
factors are summarized in Table 6. We obtain the estimates by maximizing equation (2).

Table 6 Summary of the weighting factors

Year | $\pi_1$ | $\pi_2$ | $w_1$ | $w_0$
2009 | 4.01% | 34.37% | 0.117 | 1.463
2010 | 5.33% | 43.75% | 0.122 | 1.683
Total | 4.69% | 39.36% | 0.119 | 1.572
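As a worked check, the weighting factors can be recomputed from the two fraud shares. The sketch below assumes the usual Manski and Lerman form, in which each class is weighted by its population share divided by its sample share; the function names are hypothetical and the weighted log-likelihood is a plain transcription of equation (2).

```python
import math

def wesml_weights(pi1, pi2):
    """Weights for fraud (w1) and legitimate (w0) observations."""
    w1 = pi1 / pi2                  # population share / sample share of fraud
    w0 = (1.0 - pi1) / (1.0 - pi2)  # same ratio for legitimate claims
    return w1, w0

def wesml_loglik(probs, labels, w1, w0):
    """Weighted log-likelihood as in equation (2); probs holds p_i for each claim."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += w1 * math.log(p) if y == 1 else w0 * math.log(1.0 - p)
    return total

# Fraud shares (pi1, pi2) as reported in Table 6.
TABLE6_SHARES = {"2009": (0.0401, 0.3437),
                 "2010": (0.0533, 0.4375),
                 "Total": (0.0469, 0.3936)}
weights = {year: wesml_weights(*shares) for year, shares in TABLE6_SHARES.items()}
```

For 2009 this gives a weight of about 0.117 on fraudulent observations and about 1.463 on legitimate ones, so the oversampled fraud cases are weighted down and the undersampled legitimate cases are weighted up.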

4.3 Maximum Likelihood Estimation with Omission Error

Detecting fraud is a classification problem. There are two types of misclassification,
but in this paper we assume that all claims labeled as fraudulent are correctly classified and
that the only possible misclassification is omission error (fraudulent claims undetected by
the insurer). Following the method proposed by Hausman et al. (1998), we take the
omission error into consideration in model 2, and estimate the percentage of fraudulent
claims that are not detected. Artis, Ayuso and Guillen (2002) also apply this method
to the auto insurance market.

Assume a regression model for $Y_i^*$ such that:

$$Y_i^* = X_i\beta + e_i$$

Let $\tilde{Y}_i$ be a dichotomous variable indicating presence of fraud such that:

$$\tilde{Y}_i = \begin{cases} 1, & \text{if } Y_i^* > 0 \\ 0, & \text{otherwise} \end{cases}$$

If there is no measurement error in the response, $\tilde{Y}_i$ indicates the true outcome with the
following probability:

$$\text{Prob}(\tilde{Y}_i = 1 \mid X_i) = \text{Prob}(Y_i^* > 0 \mid X_i)$$

Within the misclassification framework, assume that the observed dependent variable
could be different from the underlying outcome. Call the observed binary variable $Y_i$.
Assume that the probabilities of misclassification are as follows:

$$\alpha_0 = \text{Prob}(Y_i = 1 \mid \tilde{Y}_i = 0)$$

$$\alpha_1 = \text{Prob}(Y_i = 0 \mid \tilde{Y}_i = 1)$$

In our specification, we assume $\alpha_0 = 0$, and estimate $\alpha_1$.

The conditional expectation of the observed dependent variable is given by:

$$E(Y_i \mid X_i) = (1 - \alpha_1)\,\Phi(X_i\beta)$$

where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.

The corrected log-likelihood function is:

$$\ln L = \frac{1}{n}\sum_{i=1}^{n}\left[ Y_i \ln\left((1 - \alpha_1)\,\Phi(X_i\beta)\right) + (1 - Y_i)\ln\left(1 - (1 - \alpha_1)\,\Phi(X_i\beta)\right) \right] \qquad (3)$$

$\alpha_1$ can be estimated by maximizing the log-likelihood function in equation (3).
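The effect of the omission error on the likelihood can be seen in a short sketch. It assumes, as in the specification above, that legitimate claims are never labeled as fraud, so that only the omission probability (called alpha1 here, a name chosen for this illustration) enters; the linear index values passed in are toy numbers, not estimates from the paper.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function

def observed_fraud_prob(xb, alpha1):
    """E(Y_i | X_i) = (1 - alpha1) * Phi(X_i beta): fraud that is also detected."""
    return (1.0 - alpha1) * PHI(xb)

def omission_loglik(xb_values, labels, alpha1):
    """Average corrected log-likelihood as in equation (3)."""
    total = 0.0
    for xb, y in zip(xb_values, labels):
        q = observed_fraud_prob(xb, alpha1)
        total += y * math.log(q) + (1 - y) * math.log(1.0 - q)
    return total / len(labels)

# With alpha1 = 0 the function reduces to the ordinary probit log-likelihood.
ll_probit = omission_loglik([0.0, 0.0], [1, 0], 0.0)
ll_omission = omission_loglik([0.0, 0.0], [1, 0], 0.0466)
```

Setting alpha1 to zero recovers equation (1), which is why the model without omission error is a restricted version of the model with it.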

5 Empirical Results and Discussions

Corresponding to the model specifications in Section 4, we consider three specifications.
First, we use the plain probit model to obtain the estimates. Second, we take into
account the effect of the over-representation of fraudulent claims in our sample. In
the third specification, the omission error is considered.

The dependent variable is the claim decision made by the insurance company. We treat
a completely rejected claim as a fraudulent claim, for which the dependent variable
equals one, and zero otherwise. The explanatory variables include indicators for
fraudulent claims as well as control variables for the insured.

We perform a likelihood ratio test, and the test statistic is 18.9 with 1 degree of freedom. This
indicates that a significant improvement occurs when we include the omission error
parameter (specification 3), compared with the restricted model with no omission errors
(specification 2).
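The significance of the reported statistic can be checked against the chi-square distribution with one degree of freedom. The sketch below uses the identity that a chi-square variable with one degree of freedom is a squared standard normal, so its upper tail probability is erfc(sqrt(x/2)); the helper name is hypothetical, and the calculation takes the standard chi-square reference distribution as given.

```python
import math

def chi2_sf_1df(x):
    """Upper tail probability of a chi-square(1) variable, P(Z^2 > x)."""
    return math.erfc(math.sqrt(x / 2.0))

lr_statistic = 18.9  # likelihood ratio statistic reported in the text
p_value = chi2_sf_1df(lr_statistic)
```

The resulting p-value is far below 0.001, in line with the statement that including the omission error parameter significantly improves the fit.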

In specification 3, we find that the parameter $\alpha_1$, which estimates the probability of
omission error, is significantly different from zero. The result shows that the fraudulent
claims are underestimated by 4.66 percent. The complete regression results are shown in Table 7.


Table 7 Regression Results

VARIABLES    1: Probit    2: Over-sampling addressed    3: Omission error addressed

sex 0.00518 0.0230 0.0036

(0.0973) (0.160) (0.483)

child_dummy -0.501* -0.660 -2.411*

(0.275) (0.442) (1.421)

elder_dummy 0.598 0.493 1.309

(0.660) (1.004) (2.665)

occupation -0.137* -0.0911 -0.263

(0.0751) (0.124) (0.338)

marital -0.230 -0.277 -1.578

(0.191) (0.316) (1.252)

lnincome 0.0117 0.0217 -0.218

(0.0855) (0.143) (0.469)

hosp_type 0.0297 0.0143 -0.0225

(0.0505) (0.0800) (0.17)

hosp_rec_dummy1 0.146 0.138 0.29

(0.119) (0.187) (0.414)

hosp_rec_dummy2 0.738*** 0.760*** 1.814**

(0.138) (0.230) (0.899)

hosp_day -0.0108** -0.0109 -0.042*

(0.00455) (0.00731) (0.0224)

hosp_day_pre -0.0495** -0.0532* -0.165*

(0.0203) (0.0311) (0.0885)

lntot_cost 0.249*** 0.199* 0.166

(0.0725) (0.113) (0.259)

bed_per 1.264*** 1.042 1.749

(0.490) (0.761) (1.528)

care_per -2.508 -2.510 -8.015

(1.604) (2.314) (6.27)

diag_per 3.569* 2.833 4.797

(1.824) (3.096) (7.642)

treat_per 0.00177 0.181 1.52

(0.299) (0.493) (1.628)

test_per -0.123 0.0447 0.698

(0.330) (0.534) (1.714)

oper_per 0.707* 0.895 4.767

(0.397) (0.664) (2.922)

coverage_type 0.177 0.133 0.319

(0.129) (0.206) (0.474)

self_policyholder 0.128 0.213 0.922

(0.149) (0.244) (0.66)

renew -0.319*** -0.286*** -0.785**

(0.0316) (0.0504) (0.345)

num_other_policy -0.188 -0.132 0.052

(0.148) (0.231) (0.61)

self_claim -0.0796 -0.146 -0.196

(0.175) (0.290) (0.809)

claim_duration -0.00195*** -0.00181** -0.00518**

(0.000490) (0.000807) (0.00214)

file_duration 0.00173*** 0.00216* 0.0246**

(0.000600) (0.00119) (0.0124)

claimfreq_pre -0.0795** -0.0653 -0.0318

(0.0399) (0.0567) (0.114)

20

Constant -1.193 0.436 6.115

(1.129) (1.882) (6.49)

1 - - 0.0466***

- - (0.0135)

0 - - 0

- - -

Pseudo R2 0.2305 0.2057 -

Observations 963 963 963

*** p<0.01, ** p<0.05, * p<0.1, standard errors are in parentheses
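As a reading aid for the significance stars, the sketch below converts a coefficient and its standard error into a two-sided p-value using a standard normal approximation. That approximation and the helper names are assumptions; the paper does not state which reference distribution was used, so results close to a cut-off may not match the printed stars.

```python
import math


def two_sided_p(coef: float, se: float) -> float:
    """Two-sided p-value for H0: coef = 0 under a normal approximation."""
    z = abs(coef / se)
    # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


def stars(coef: float, se: float) -> str:
    """Map a p-value to the star convention in the table note."""
    p = two_sided_p(coef, se)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# A few specification 1 entries (coefficient, standard error) from Table 7.
examples = {
    "hosp_rec_dummy2": (0.738, 0.138),
    "hosp_day": (-0.0108, 0.00455),
    "child_dummy": (-0.501, 0.275),
    "elder_dummy": (0.598, 0.660),
}
for name, (coef, se) in examples.items():
    print(name, round(coef / se, 2), stars(coef, se))
```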

In Table 7, we find that most of the parameters’ signs are consistent with our expectation.

Table 8 lists the expected versus the obtained parameter signs.

Table 8 Comparison of the Expected and the Obtained Parameter Signs

Variable            Obtained sign     Expected sign
hosp_type           inconsistent⁸     -
hosp_rec_dummy1     +                 +
hosp_rec_dummy2     +                 +
hosp_day            -                 +
hosp_day_pre        -                 -
lntot_cost          +                 +
bed_per             +                 indefinite
care_per            -                 indefinite
diag_per            +                 indefinite
treat_per           +                 indefinite
test_per            inconsistent      indefinite
oper_per            +                 indefinite
coverage_type       +                 +
self_policyholder   +                 indefinite
renew               -                 -
num_other_policy    inconsistent      -
self_claim          -                 -
claim_duration      -                 -
file_duration       +                 +
claimfreq_pre       -                 -

8 "Inconsistent" indicates that the sign of the coefficient is not the same across the three specifications.

Most parameter signs are consistent across all three specifications. The exceptions are the coefficients of income (lnincome), hospital type (hosp_type), test percentage (test_per) and number of other policies (num_other_policy), none of which is statistically significant.
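The "inconsistent" label can be reproduced mechanically from the Table 7 coefficients. The following is a minimal sketch; the dictionary holds a subset of the reported coefficients, and the helper name `sign_label` is hypothetical.

```python
# Coefficients from Table 7, ordered as (specification 1, 2, 3).
coefs = {
    "lnincome": (0.0117, 0.0217, -0.218),
    "hosp_type": (0.0297, 0.0143, -0.0225),
    "test_per": (-0.123, 0.0447, 0.698),
    "num_other_policy": (-0.188, -0.132, 0.052),
    "renew": (-0.319, -0.286, -0.785),
    "hosp_rec_dummy2": (0.738, 0.760, 1.814),
}


def sign_label(values):
    """Return '+', '-' or 'inconsistent' for one variable's coefficients."""
    if all(v > 0 for v in values):
        return "+"
    if all(v < 0 for v in values):
        return "-"
    return "inconsistent"


labels = {name: sign_label(values) for name, values in coefs.items()}
for name, label in labels.items():
    print(f"{name:<18}{label}")
```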

As shown in Table 8, we find several indicators of fraudulent medical claims. Most of them relate either to the medical service and its provider or to characteristics of the insurance policy.

The hosp_rec_dummy2 variable shows a strong positive relationship with a claim being fraudulent: if the insured seeks medical service from a provider that is not qualified by the insurer, the claim is more likely to be fraudulent. The hosp_rec_dummy1 variable, which indicates a qualified provider that is not recommended by the insurer, is not significant. It does, however, have the expected positive sign, suggesting that claims from providers not on the recommendation list have a higher probability of being fraudulent than claims from recommended providers.

Both the length of the current hospital stay (hosp_day) and the length of prior hospital stays (hosp_day_pre) are significant indicators of medical fraud, although hosp_day is not significant in specification 2. All signs in the three specifications are negative, meaning that the longer the insured stays in the hospital, this time or previously, the lower the probability of a fraudulent claim. The obtained sign for the length of the current hospital stay differs from our original hypothesis. We propose two reasons. First, the longer the hospital stay, the higher the probability that the claim will be subjected to scrutiny in claim handling, so an insured who plans to commit fraud will keep the hospital stay within a reasonable limit. Second, this insurance product caps the bed charge that can be reimbursed, so a fraudster planning the fraud will limit the length of his or her stay.

The influence of the total cost is significant at the 1 percent level in specification 1 and at the 10 percent level in specification 2. The parameter signs in all three specifications are positive, indicating that the higher the total cost, the higher the probability of a fraudulent claim, which is consistent with our expectation.

In contrast to the results of a prior study (Shin et al., 2012), the influence of the composition of expenditure is not significant in general. Only bed charge, diagnosis expenditure and operation cost are significant at the 10 percent level or better in specification 1, and none is significant when over-sampling or omission error is taken into consideration. The main reason we propose is that prior studies either controlled for diagnosis information or focused on a certain kind of disease (Ireson, 1997). Our sample has a limited number of observations and various disease types; without controlling for disease type, the cost composition cannot be used to predict fraudulent cases.

The renew variable indicates the total number of years since the insured first purchased this product. Consistent with our expectation, the longer the insured has renewed with the same insurer, the less likely he or she is to commit fraud.

The claim_duration and file_duration variables are both significant in all three specifications, and their signs are consistent with our expectation. The claim_duration variable measures the number of days between policy commencement and hospital admission. Its negative sign shows that an insured who intends to commit fraud is eager to stage the incident soon after the policy takes effect. The file_duration variable measures the number of days between hospital admission and submission of the claim material. Its positive sign shows that insureds who spend more time preparing the claim material are more likely to commit fraud.

The number of claims filed prior to the current claim has a negative impact on the probability of fraud, as expected, but it is significant only in specification 1, at the 5 percent level.

Among our control variables for the characteristics of the insured, most are not statistically significant when omission error is considered; the exception is child_dummy. The sign of the child_dummy parameter is negative, as expected, since children are less likely to be involved in medical insurance fraud.

Table 9 Marginal effects

Models              Probit          Model 1         Model 2
                    (spec. 1)       (spec. 2)       (spec. 3)
Variable
sex                 0.00193         0.00282         9.253E-07
                    (0.0362)        (0.0197)        -
child_dummy         -0.187*         -0.0810         -6.204E-04
                    (0.103)         (0.0543)        -
elder_dummy         0.223           0.0605          3.367E-04
                    (0.246)         (0.123)         -
occupation          -0.0510*        -0.0112         -6.779E-05
                    (0.0280)        (0.0152)        -
marital             -0.0857         -0.0341         -4.061E-04
                    (0.0712)        (0.0386)        -
lnincome            0.00435         0.00266         -5.599E-05
                    (0.0318)        (0.0175)        -
hosp_type           0.0111          0.00176         -5.780E-06
                    (0.0188)        (0.0098)        -
hosp_rec_dummy1     0.0544          0.0170          7.454E-05
                    (0.0443)        (0.0230)        -
hosp_rec_dummy2     0.275***        0.0933***       4.668E-04
                    (0.0511)        (0.0285)        -
hosp_day            -0.00401**      -0.00133        -1.080E-05
                    (0.00169)       (0.00090)       -
hosp_day_pre        -0.0184**       -0.00654*       -4.238E-05
                    (0.00753)       (0.00384)       -
lntot_cost          0.0926***       0.0244*         4.259E-05
                    (0.0270)        (0.0139)        -
bed_per             0.471**         0.128           4.499E-04
                    (0.183)         (0.0920)        -
care_per            -0.934          -0.308          -2.062E-03
                    (0.598)         (0.282)         -
diag_per            1.330*          0.348           1.234E-03
                    (0.681)         (0.375)         -
treat_per           0.000660        0.0223          3.911E-04
                    (0.112)         (0.0606)        -
test_per            -0.0459         0.00549         1.795E-04
                    (0.123)         (0.0655)        -
oper_per            0.263*          0.110           1.226E-03
                    (0.148)         (0.0809)        -
coverage_type       0.0660          0.0163          8.210E-05
                    (0.0481)        (0.0253)        -
self_policyholder   0.0475          0.0261          2.372E-04
                    (0.0555)        (0.0299)        -
renew               -0.119***       -0.0351***      -2.019E-04
                    (0.0117)        (0.00634)       -
num_other_policy    -0.0699         -0.0162         1.339E-05
                    (0.0551)        (0.0285)        -
self_claim          -0.0297         -0.0179         -5.051E-05
                    (0.0650)        (0.0357)        -
claim_duration      -0.000728***    -0.000223**     -1.333E-06
                    (0.000182)      (0.0000988)     -
file_duration       0.000645***     0.000265*       6.320E-06
                    (0.000224)      (0.000144)      -
claimfreq_pre       -0.0296**       -0.00802        -8.180E-06
                    (0.0148)        (0.00702)       -

*** p<0.01, ** p<0.05, * p<0.1, robust standard errors clustered by groups are in parentheses

Marginal effects at the means of the independent variables are reported in Table 9. We note that the marginal effects in specification 3 are very small compared with those of the other two models. The underlying reason is that the latent variable $Y^*$ in specification 3 is higher than in specifications 1 and 2. In a probit model, the probability of a case being fraudulent is $\Phi(X\beta)$, where $\Phi(\cdot)$ is the standard normal distribution function; therefore the marginal effect of $X_i$ is $\phi(X\beta)\beta_i$, in which $\phi(\cdot)$ denotes the density function of a standard normal distribution. $X\beta$ represents the latent variable $Y^*$, and can be calculated after a regression by setting each $X_i$ to its mean. As we take both over-representation and omission error into consideration in specification 3, the predicted $Y^*$ becomes larger and moves into the tail of the density, so the marginal effect diminishes, as shown in Figure 1.

Figure 1: Marginal Effects in Different Specifications
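The marginal-effect formula can be checked against the reported numbers with a small sketch. Dividing a reported marginal effect by its coefficient gives the implied density value, from which the magnitude of the latent index at the means follows; because the density is symmetric, only the absolute value is identified. The helper names are hypothetical, and the inputs are the hosp_rec_dummy2 and renew entries of Tables 7 and 9.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_effect(beta_i: float, xb: float) -> float:
    """Probit marginal effect of X_i evaluated at latent index xb."""
    return norm_pdf(xb) * beta_i


def implied_abs_index(beta_i: float, me_i: float) -> float:
    """Back out |xb| from one coefficient and its reported marginal effect."""
    density = me_i / beta_i
    return math.sqrt(-2.0 * math.log(density * math.sqrt(2.0 * math.pi)))


# hosp_rec_dummy2: coefficient (Table 7) and marginal effect (Table 9).
z1 = implied_abs_index(0.738, 0.275)      # specification 1
z3 = implied_abs_index(1.814, 4.668e-4)   # specification 3

# Cross-check: apply the specification 3 index to the renew coefficient.
renew_me_spec3 = marginal_effect(-0.785, z3)

print(f"|Y*| at the means, specification 1: {z1:.3f}")
print(f"|Y*| at the means, specification 3: {z3:.3f}")
print(f"implied renew marginal effect, specification 3: {renew_me_spec3:.3e}")
```

The implied index is far larger in magnitude in specification 3 than in specification 1, which is why the same formula yields marginal effects several orders of magnitude smaller.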

To check the adequacy of our models, we report the classification results in Tables 10, 11 and 12. We chose the threshold for predicting a fraudulent claim using a grid search, compromising between the best classification in the whole sample and the best classification within fraudulent cases.

[Figure 1: plot not recoverable. Horizontal axis: Y, from -2.5 to 2.5. Vertical axis: from 0 to 1. Marked: Y* in specification 2.]

Table 10 Classification Table for Specification 1

                    Predicted type
Observed type       Legitimate    Fraudulent    Total
Legitimate          474           110           584
Fraudulent          145           234           379
Total               619           344           963

When the estimated probability of fraud exceeded 0.5, the predicted type was fraud.


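[Table 11, classification table for specification 2: only the totals row survives. Total: 359 predicted legitimate, 604 predicted fraudulent, 963 claims.]

[Table 12, classification table for specification 3: only the totals row survives. Total: 393 predicted legitimate, 570 predicted fraudulent, 963 claims.]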



In specification 1 (the basic probit model), using a threshold of 0.5 (see footnote 9), 74 percent of all observations were correctly classified, which is acceptable. The conditional percentage of legitimate claims that were correctly classified was 81 percent. However, the conditional percentage of fraudulent claims that were correctly classified was only 62 percent, showing that the probit model without weighted sampling and omission error is not ideal for detecting medical insurance fraud.
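The three percentages quoted for specification 1 follow directly from the counts in Table 10. A minimal sketch of the arithmetic, with hypothetical variable names:

```python
# Counts from Table 10 (rows: observed type, columns: predicted type).
legit_as_legit, legit_as_fraud = 474, 110
fraud_as_legit, fraud_as_fraud = 145, 234

n_legit = legit_as_legit + legit_as_fraud
n_fraud = fraud_as_legit + fraud_as_fraud
n_total = n_legit + n_fraud

# Share of all claims, of legitimate claims, and of fraudulent claims
# that were classified correctly.
total_pct = 100.0 * (legit_as_legit + fraud_as_fraud) / n_total
legit_pct = 100.0 * legit_as_legit / n_legit
fraud_pct = 100.0 * fraud_as_fraud / n_fraud

print(f"total: {total_pct:.2f}%")
print(f"legitimate: {legit_pct:.2f}%")
print(f"fraudulent: {fraud_pct:.2f}%")
```

These values agree with the 0.5 threshold column for specification 1 in Table 13.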

In specification 2, the threshold was set to 0.8, since it yields the highest overall classification percentage while keeping the share of correctly classified fraudulent cases above 85 percent. In this case, about 89 percent of fraudulent claims and 54 percent of legitimate claims were correctly classified. Overall, 68 percent of observations were correctly classified. In this respect, the model is more effective in detecting fraud than the basic probit model.

9 For the complete results of the threshold grid search, please refer to the appendix.

Using the same criteria as in specification 2, the threshold was set to 0.9 in specification 3 to yield the best compromise between overall performance and performance on fraudulent claims. About 86 percent of fraudulent claims and 58 percent of legitimate claims were correctly classified. The total percentage of correct classification was 69 percent, which is acceptable in terms of both adequacy and efficiency in detecting medical insurance fraud.
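The threshold rule described for specifications 2 and 3 (highest overall percentage subject to at least 85 percent of fraudulent claims being correctly classified) can be sketched as follows. The percentages are copied from Table 13 in the appendix; the function name and the tie-breaking rule are assumptions.

```python
THRESHOLDS = [0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]

# Percent correctly classified, from Table 13.
TABLE_13 = {
    "specification 2": {
        "total": [44.65, 47.04, 48.08, 50.47, 54.00, 57.42,
                  59.81, 63.86, 67.91, 71.13, 73.10, 72.48],
        "fraudulent": [100.00, 99.74, 99.47, 98.94, 98.42, 97.89,
                       96.83, 94.20, 88.92, 81.00, 70.71, 50.40],
    },
    "specification 3": {
        "total": [55.24, 56.49, 58.15, 58.98, 60.44, 61.99,
                  63.03, 64.80, 65.94, 67.71, 68.95, 70.61],
        "fraudulent": [99.47, 98.94, 98.68, 98.42, 98.15, 97.89,
                       97.36, 96.04, 93.93, 91.29, 85.75, 67.28],
    },
}


def pick_threshold(total, fraudulent, min_fraud_pct=85.0):
    """Best overall accuracy among thresholds meeting the fraud constraint.

    Ties on overall accuracy go to the higher threshold (an assumption).
    Raises ValueError if no threshold satisfies the constraint.
    """
    eligible = [
        (tot, thr)
        for thr, tot, fr in zip(THRESHOLDS, total, fraudulent)
        if fr >= min_fraud_pct
    ]
    _, best_threshold = max(eligible)
    return best_threshold


chosen = {
    spec: pick_threshold(rows["total"], rows["fraudulent"])
    for spec, rows in TABLE_13.items()
}
print(chosen)
```

Applied to the Table 13 figures, this rule selects the same thresholds as reported in the text for the two specifications.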

6 Concluding Remarks

Health insurance fraud causes higher insurance prices and significant welfare loss to society; detecting fraud is therefore important for improving efficiency in the insurance industry. Fraud detection techniques have been studied extensively by both academics and industry analysts, yet most empirical studies focus on health insurance fraud in developed countries, and there is little evidence on the nascent commercial health insurance market in China.

We use a discrete choice model that accounts for over-sampling and omission error to identify the predictive factors of medical insurance fraud. We find that the hospital's qualification, the total cost of healthcare, the policyholder's renewal status, claim duration and file duration are predictive factors of medical insurance fraud.

Our research contributes by broadening the understanding of predictive variables for health insurance fraud in China. We expect our analysis to help insurers in China better evaluate their claims and improve the efficiency and accuracy of claim management.

Appendix:

The grid search results for the classification thresholds are shown in Table 13.

Table 13 The percentage of correctly classified claims under different levels of threshold

Correctly classified (%), by threshold
                 0.4     0.45    0.5     0.55    0.6     0.65    0.7     0.75    0.8     0.85    0.9     0.95
Specification 1
  total*         73.31   73.94   73.52   73.52   72.79   71.34   69.68   66.87   65.01   -       -       -
  fraudulent*    75.46   68.87   61.74   55.67   48.28   39.05   30.34   21.11   14.78   -       -       -
  legitimate*    71.92   77.23   81.16   85.10   88.70   92.29   95.21   96.58   97.60   -       -       -
Specification 2
  total          44.65   47.04   48.08   50.47   54.00   57.42   59.81   63.86   67.91   71.13   73.10   72.48
  fraudulent     100.00  99.74   99.47   98.94   98.42   97.89   96.83   94.20   88.92   81.00   70.71   50.40
  legitimate     8.73    12.84   14.73   19.01   25.17   31.16   35.79   44.18   54.28   64.73   74.66   86.82
Specification 3
  total          55.24   56.49   58.15   58.98   60.44   61.99   63.03   64.80   65.94   67.71   68.95   70.61
  fraudulent     99.47   98.94   98.68   98.42   98.15   97.89   97.36   96.04   93.93   91.29   85.75   67.28
  legitimate     26.54   28.94   31.85   33.39   35.96   38.70   40.75   44.52   47.77   52.40   58.05   72.77

Note:
Total: The total percentage of observations being correctly classified.
Fraudulent: The percentage of fraudulent claims being correctly classified.
Legitimate: The percentage of legitimate claims being correctly classified.
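Because the observed row totals are fixed (584 legitimate and 379 fraudulent claims in Table 10), the percentages in Table 13 determine the cell counts of a classification table at any threshold. The sketch below rebuilds the counts for specification 2 at 0.8 and specification 3 at 0.9 and compares the column totals with the two "Total" rows printed after Table 10 (359, 604, 963 and 393, 570, 963). The function name is hypothetical, and the rounding step assumes the printed percentages were rounded to two decimals.

```python
N_LEGIT, N_FRAUD = 584, 379  # observed row totals from Table 10


def rebuild_cells(fraud_pct: float, legit_pct: float) -> dict:
    """Recover classification counts from conditional percentages."""
    fraud_as_fraud = round(N_FRAUD * fraud_pct / 100.0)
    legit_as_legit = round(N_LEGIT * legit_pct / 100.0)
    fraud_as_legit = N_FRAUD - fraud_as_fraud
    legit_as_fraud = N_LEGIT - legit_as_legit
    return {
        "legitimate_row": (legit_as_legit, legit_as_fraud),
        "fraudulent_row": (fraud_as_legit, fraud_as_fraud),
        # (predicted legitimate, predicted fraudulent)
        "column_totals": (legit_as_legit + fraud_as_legit,
                          legit_as_fraud + fraud_as_fraud),
    }


spec2 = rebuild_cells(fraud_pct=88.92, legit_pct=54.28)  # threshold 0.8
spec3 = rebuild_cells(fraud_pct=85.75, legit_pct=58.05)  # threshold 0.9

print("specification 2:", spec2)
print("specification 3:", spec3)
```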

References:

Ai, J., P. Brockett, and L. Golden (2009) “Assessing Consumer Fraud Risk in Insurance

Claims: An Unsupervised Learning Technique Using Discrete and Continuous

Predictor Variables”, North American Actuarial Journal, 13(4):439-458.

Artís, M., M. Ayuso, and M. Guillén (1999) “Modelling Different Types of Automobile

Insurance Fraud Behaviour in the Spanish Market”, Insurance: Mathematics and

Economics, 24: 67-81.

Artís, M., M. Ayuso, and M. Guillén (2002) “Detection of Automobile Insurance Fraud

with Discrete Choice Models and Misclassified Claims”, The Journal of Risk and

Insurance, 69(3):325-340.

Belhadji, El Bachir, G. Dionne, and F. Tarkhani (2000), "A Model for the Detection of

Insurance Fraud", The Geneva Papers on Risk and Insurance, 25(4):517-538.

Brockett, P. L., R. Derrig, L. Golden, A. Levine, and M. Alpert (2002), The Journal of

Risk and Insurance, 69(3): 341-371.

Brockett, P. L., X. Xia, and R. A. Derrig (1998) "Using Kohonen's Self-Organizing


Feature Map to Uncover Automobile Bodily Injury Claims Fraud", The Journal of Risk

and Insurance, 65(2): 245-274.

Caudill, S. B., M. Ayuso, and M. Guillén (2005) “Fraud Detection Using a Multinomial

Logit Model with Missing Information”, The Journal of Risk and Insurance, 72(4):

539-550.

Derrig, R. A. (2002) “Insurance Fraud”, The Journal of Risk and Insurance, 69(3): 271-

287.

Derrig, R.A., and K.M. Ostaszewski (1995),"Fuzzy Techniques of Pattern

Recognition in Risk and Claim Classification", The Journal of Risk and Insurance,

62(3), 447-482.

Hausman J. A., J. Abrevaya, and F. M. Scott-Morton (1998) “Misclassification of the

Dependent Variable in a Discrete-response Setting”, Journal of Econometrics, 87: 239-

269.

He, H., J. Wang, W. Graco, and S. Hawkins (1997),"Application of Neural Networks to

Detection of Medical Fraud", Expert Systems with Applications, 13(4): 329-336.

He, H., W. Graco, and X. Yao (1999), "Application of Genetic Algorithm and k-Nearest

Neighbour Method in Medical Fraud Detection", 2nd Asia-Pacific Conference on

Simulated Evolution and Learning (SEAL 98), Nov. 24-27, 1998.

Ireson, C. L (1997), "Critical Pathways: Effectiveness in Achieving Patient Outcomes",

Journal of Nursing Administration, 27(6): 16-23.

Kou, Y., C. Lu, S. Sirwongwattana, and Y. Huang (2004),"Survey of Fraud Detection

Techniques", International Conference on Networking, Sensing & Control Taipei,

Taiwan, March 21-23, 2004.

Li, J., K. Huang, J. Jin, and J. Shi (2008) "A Survey on Statistical Methods for

Healthcare Fraud Detection", Health Care Manage Science, 11:275-287.

Liou, F., Y. Tang, and J. Chen (2008), "Detecting Hospital Fraud and Claim Abuse

through Diabetic Outpatient Services", Health Care Manage Science, 11:353–358.

Major, J. A. and D. R. Riedinger (2002), "EFD: A Hybrid Knowledge/Statistical-Based

System for the Detection of Fraud", The Journal of Risk and Insurance, 69(3):309-324.

Manski, C. and S. R. Lerman (1977) “The Estimation of Choice Probabilities from

Choice Based Samples”, Econometrica, 45(8):1977-1988.

Mao, L. (2008) “Research on the Health Insurance Anti-fraud in China”, working

paper, http://www.docin.com/p-224528482.html.

Ortega, P. A., C. J. Figueroa and G. A. Ruz (2006), "A Medical Claim Fraud/Abuse

Detection System based on Data Mining: A Case Study in Chile", In Proceedings of

International Conference on Data Mining, Las Vegas, Nevada, USA.

Shin, H., H. Park, J. Lee, and W. C. Jhee (2012), "A Scoring Model to Detect Abusive


Billing Patterns in Health Insurance Claims", Expert Systems with Applications,

39:7441-7450.

Stefano, B., and F. Gisella (2001), "Insurance Fraud Evaluation A Fuzzy Expert

System", 2001 IEEE International Fuzzy Systems Conference.

Viaene, S., R. A. Derrig, B. Baesens, and G. Dedene (2002), "A Comparison of State-

of-the-Art Classification Techniques for Expert Automobile Insurance Claim Fraud

Detection", The Journal of Risk and Insurance, 69(3):373-421.

Weisberg, H. I., and R. A. Derrig (1991),"Fraud and Automobile Insurance: A Report

on the Baseline Study of Bodily Injury Claims in Massachusetts", Journal of Insurance

Regulation, 9: 497-541.

Weisberg, H. I., and R. A. Derrig (1995),"Identification and Investigation of Suspicious

Claims, in: AIB Cost Containment/Fraud Filing", (DOI Docket R95-12) (Boston, Mass.:

Automobile Insurers Bureau of Massachusetts).

Weisberg, H. I., and R. A. Derrig (1998),"Quantitative Methods for Detecting

Fraudulent Automobile Bodily Injury Claims", Risques, July-September: 35: 75-99.

Yamanishi, K., J. Takeuchi, G. Williams, and P. Milne (2004), "On-line Unsupervised

Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms", Data

Mining and Knowledge Discovery, 8:275-300.

Yang, W., S. Hwang (2006), "A Process-mining Framework for the Detection of

Healthcare Fraud and Abuse", Expert Systems with Applications, 31:56-68.
