Upload
doankhue
View
214
Download
1
Embed Size (px)
Citation preview
1
Detection of Health Insurance Fraud with Discrete Choice
Model: Evidence from Medical Expense Insurance in China
Abstract:
Health insurance fraud increases the inefficiency and inequality in our society. To
address the widespread problem, cost effect techniques are in need to detect fraudulent
claims. With a dataset from medical expense insurance in China, we propose a discrete
choice model to identify predicting factors of fraudulent claims, and we address the
major limitations of discrete choice model by considering over sampling of fraudulent
cases, as well as mislabeling of legitimate claims (omission error). Our results show
that a few factors, such as hospital’s qualification and policyholder’s renewal status,
could be used to predict fraudulent claims for further investigation.
Key words:
Medical expense insurance, insurance fraud, discrete choice model, predicting factor,
omission error
2
1 Introduction
Health insurance is a critical mechanism for financing healthcare need in a modern
society. Health insurance fraud comes as an unwanted byproduct, contributing to rising
health insurance costs and resulting in significant social welfare loss. According to
the Global Health Care Anti-fraud Network (GHCAN), health insurance fraud has
become a worldwide problem suffered by both developed countries with sophisticated
healthcare systems and developing countries with emerging health insurance markets.
Globally, it is estimated that the annual total cost of health insurance fraud could reach
$260 billion, or 6% of global healthcare spending.1 In the U.S. alone, it is estimated
that health insurance fraud costs up to $80 billion annually, accounting for 3% of the
annual national health care spending. 2 In an emerging market such as China,
commercial health insurance market is still at a nascent stage in terms of premium
income 3 , but fraud is already wide-spread, causing losses equal to 10%-30% of
premium income (Mao, 2008; Munich Re, 2013).4 The China Insurance Regulatory
Commission (CIRC) estimated the growth rate of insurance fraud cases was around 20%
in 2011, and in response to this rising problem CIRC proposed to build its own
insurance anti-fraud system in 2012.5
Multiple stakeholders should be involved to detect fraudulent claims effectively and
accurately, including academia, the insurance industry, regulatory institutions and
international organizations such as the GHCAN. In a developed market, all
stakeholders work coherently and develop an advanced fraud detection system using
abundant data and predictive analytics to provide efficient fraud management.6 In an
emerging market such as China, the typical procedure to detect health insurance fraud
still follows simple guidance criteria such as claim amount threshold, and then largely
relies on the experience and skill of an individual claim adjuster to perform a manual
investigation. Both the efficiency and accuracy could be improved dramatically with
an automated fraud detection system. Despite the urgent need, as far as we know,
there has been no study focused on health insurance fraud in China yet. We attempt to
1 http://www.ghcan.org/challenge.html 2 http://www.fbi.gov/about-us/investigate/white_collar/health-care-fraud 3 In 2012, the premium income of health insurance in China is 86.3 billion Yuan ($14 billion), comprising merely
8% of the total life insurance premium. 4 http://www.docin.com/p-718280146.html 5 http://www.circ.gov.cn/tabid/5171/InfoID/219312/frtid/5225/Default.aspx 6 http://www.fico.com/en/products/fico-insurance-fraud-manager-health-care-edition/
3
fill this gap and provide evidence on contributing factors in predicting health insurance
fraud in this emerging market.
We develop our hypotheses and theoretical background in the following, then we
present our data and the empirical models, as well as discuss the results. We present
the concluding remarks in the end.
2 Theory and Hypotheses Development
2.1 Overview on Fraud Detection Methodology
The methods of detecting insurance fraud fall largely into two groups. The supervised
learning methods make use of prior information on the dependent variable (fraudulent
or legitimate) in a training subset of data to obtain patterns in predicting variables. Some
examples of supervised learning methods include discrete choice models (Artis et al.,
1999, 2002; Belhadji et al., 2000; Caudill et al., 2005), other standard econometric
models (Weisberg and Derrig, 1991, 1995, 1998), the expert system (Major and
Riedinger, 2002; Stefano and Gisella, 2001), as well as active learning and cost-
sensitive learning methods. Unsupervised learning methods do not rely on
predetermined status of dependent variable but extract information from the predicting
variables directly. Some examples include cluster analysis, unsupervised neural
network (Brockett et al., 1998) and other data mining methods (Kou et al., 2004; Yamanishi,
2004).
Compared to unsupervised methods, supervised methods tend to be more accurate since
additional information on dependent variable is employed in the training sample. But
the major limitations are: first, it could difficult and (or) costly to obtain “labels” for
training sample; second, due to the nature of fraud, unbalanced data (too few fraudulent
cases compared with legitimate ones) is almost inevitable and requires specific
treatment; third, the labeling of dependent variable could be inaccurate
(misclassification problem).
In our study, we obtained a dataset with prior information of whether the claim is
fraudulent, therefore the choice of using a supervised learning method is natural.
Among different supervised methods, we choose discrete choice model. It’s
4
straightforward to use, and the results could be easily interpreted. In addition, we used
weighted exogenous sampling maximum likelihood estimation to address the
oversampling of fraudulent claims in our sample, and further consider omission error
to address the inaccuracy of predetermined labelling of dependent variable.
2.2 Literature Summary on Health Insurance Fraud Predicting Indicators
In the area of detecting insurance fraud, various methods are applied in different lines
of products as shown in Table 1.
Table 1 Summary of relevant literature
Methodology supervised learning methods unsupervised learning
methods
Derrig (2002)
Hausman et al. (1998)
Li et al. (2008)
Manski and Lerman (1977)
Brockett et al. (1998)
Kou et al. (2004)
Yamanishi (2004)
Empirical
auto
insurance
Artis et al. (1999)
Artis et al. (2002)
Belhadji et al. (2000)
Caudill et al. (2005)
Derrig and Ostaszewski (1995)
Stefano and Gisella (2001)
health
insurance
He et al. (1997)
Liou et al. (2008)
He et al. (2000)
Liou et al. (2008)
Major and Riedinger
(2002) Ortega et al. (2006)
Shin et al. (2012)
Yamanishi et al. (2004) Yang and Hwang (2006)
other lines
(BI in auto)
Viaene et al. (2002)
Weisberg and Derrig (1990,
1993, 1995, 1998)
Ai et al. (2009)
Brockett et al. (2002)
While there are a series of empirical studies on insurance fraud in auto lines (for either
property damage or bodily injury claims) (Artis et al., 1999; Brockett et al., 2002;
Caudill et al., 2005; Derrig and Ostaszewski, 1995), scholars start to present findings
in health insurance as data becomes available (He et al., 1997, Liou et al., 2008; Major
and Riedinger, 2002; Yamanishi et al., 2004). As suggested by Li et al. (2008), due to
legal issues or concerns over privacy protection, the papers presenting details on
indicators for health care fraud is scarce.
5
Most of the existing studies employ unsupervised learning methods. Major and
Riedinger (1992) analyzed Electronic Fraud Detection (EFD) using by an insurance
company, and it provides a general framework for health insurance fraud indicator
classification, and it includes five categories, i.e. financial indicators, medical logic
indicators (whether a medical situation would normally happen), abuse indicators
(frequency of treatment), logistics indicators (the place, timing and sequences of
activities) as well as identification indicators (the way providers present information).
In specific for health fraud committed by medical laboratory, Yamanishi et al. (2004)
used outlier detection method to identify the test categories (chemical, microbiology,
and immunology) distribution, the number of different patients, and the test frequency
as potential indicators to detect fraud.
Regarding abusive utilization in outpatient clinics, Shin et al. (2012) uses a scoring
model to detect outpatient abusive billing patterns using profiling information extracted
from electronic insurance claims in South Korea. They rely on domain experts to
generate an index to decide whether further investigation is warranted. It includes
measurement of various charges composition (total utilization, medications, injections,
laboratory tests, and diagnostic radiology), total charges for the five most frequent
diagnoses, rates of utilization of specific services (antibiotics and corticosteroids),
utilization of visits and prescription drugs.
Similar to Shin et al. (2012), Liou et al. (2008) also takes healthcare provider as the unit
to examine its fraudulent medical claims, and it uses three different approaches
including logistic regression, neural network and classification trees. It uses nine
variables including average days of drug dispense, average drug cost, average
consultation and treatment fees, average diagnosis fees, average dispensing service fees,
average medical expenditure, average amount claimed, average drug cost per day, and
average medical expenditure per day, and it finds eight out of the nine variables being
significant predicting ones.
Our study is different from the previous literature in three ways. First, we adopt more
sophisticated discrete choice model to detect medical fraud and this type of method was
6
not frequently used before. Second, we are the first to focus on product providing
inpatient medical expense insurance in China. Third, we focus on indicators of
individual fraudulent behavior (insured) rather than institutional behavior (healthcare
provider), therefore our results could provide more informative inference for insurer.
2.3 Hypotheses Development for Specific Indicators
We choose characteristics on healthcare provider and service (type of hospital, number
of days in hospital this time and previously under this product, total cost, and
composition of total cost across bed charge, medicine, care, diagnosis, treatment,
operation and lab test) and characteristics on policy (coverage, renew status, claim
duration, file duration and previous claim frequency etc.) as our fraud indicators,
controlling for demographics of the insured (sex, age, occupation, marital status, and
income).
In specific, we hypothesize that:
a. A few variables defining the nature of hospital are predictive of medical fraud.
The type/ranking of hospital could be predictive of fraud. Those lower ranked
community clinics could be networked more easily, therefore prone to fraudulent
behavior compared to the national wide top hospital (ranked III-A).
In addition, if a hospital is qualified provider under the insurance contract, the
probability of fraud would decrease. Furthermore, if the policyholder seeks service
from a recommended provider, the probability of fraud should also decrease.
b. The number of days stayed in hospital and total cost for current stay.
These two variables are dependent to each other to some extent. We hypothesize
that as the patient spends more days in hospital, or has a larger bill for the stay, it’s
a likely signal for fraudulent behavior. As these signals draw attention of claim
adjuster, there is a higher probability of fraudulent behavior being discovered.
c. Composition of cost.
The total cost is consisted of seven categories including bed charge, medicine
charge, diagnosis charge, treatment charge, test charge, operation charge and charge
for care (labor) delivered. If one or a few categories are dominant in the total cost,
it could be a potential signal of fraudulent claim.
d. Coverage type.
7
If it’s a planned fraud, the fraudster may tend to purchase policy with higher limit
and more comprehensive coverage, therefore, we hypothesize there is a positive
correlation between coverage type and fraud.
e. Renewal status, number of days stayed in hospital in previous claims, and number
of claim filed previously.
These three variables indicates the history of a given insured with the product. We
hypothesize that if it’s a renewed customer, it’s less likely to commit fraud.
Furthermore, if the insured filed claims previously, then he/she had undergone claim
auditing before, therefore diminishing the probability of fraud.
f. Number other policies with the same company.
We hypothesize that if the customer bought other policies (such as auto insurance),
then it’s less likely to commit fraud, because information gathered from other
policies could be used by insurer in claim auditing.
g. Self-claim preparation.
If a claim is filed and materials being prepared by insured himself/herself, we
hypothesize the probability of fraud would diminishing.
h. Claim duration.
It’s the number of days between policy commencement to hospitalization. If it’s a
planned fraud, the fraudster tends to shorten the claim duration, therefore there is a
negative correlation between claim duration and fraud.
i. File duration.
It’s the number of days between hospitalization and submission of complete claim
files. For fraudulent claims, it might take longer to get forge the material resulting
in a positive correlation between file duration and fraud.
3 Data
3.1 Medical Expense Insurance Fraud in China
There are three main types of health insurance products in China, namely medical
expense insurance, critical illness insurance and accident insurance with health expense
coverage. We chose medical expense insurance as our target product because it is the
dominant health insurance product, and the fraud is more prevalent compared to the
other two products. We obtained data of an individual inpatient medical expense
8
insurance product from a leading health insurance company in China. Insured aged
between 28 days and 59 years old are eligible to purchase this product. It is designed
with three levels of coverage, with the premium depending on age, gender and coverage
level. The coverage limits in various sub-categories are described in Table 2. There is
no deductible, and the copayment percentage is 20%. An additional coverage of 5%
of the medical expense claim payoff is provided if the insured seeks healthcare from a
recommended hospital.
Table 2 Insurance coverage for individual medical expense insurance
Low
coverage
Medium
coverage
High
coverage
Insurance
Coverage
Medical
expense
coverage
in sub-
categories
(in Yuan)
Bed charge
Average daily
limit 50 80 100
Total limit 4,500 7,200 9,000
Medicine
charge
Average daily
limit 100 150 200
Total limit 9,000 13,500 18,000
Care charge 200 500 900
Diagnosis charge 200 500 900
Treatment charge 1,500 3,000 4,500
Lab charge 2,000 4,000 6,000
Operation charge 2,000 4,000 6,000
Additional coverage (in Yuan) Additional 5% of medical expense
claim payoff
There exists a range of definitions for health insurance fraud, from hard fraud in the
form of criminal actions to soft fraud in the form of over-utilization or over-estimation
of existing expense (Ai, et.al, 2009). In this product, the major types of fraud include
concealing a pre-existing condition, forgery of medical expense receipts and documents,
as well as inflating days of inpatient service. There is virtually no consensus on the
definition of insurance fraud in the existing literature. We use the insurer’s decision as
a proxy of insurance fraud in model 1 and adjust for the insurer’s omission error in
model 2.
9
3.2 Sample Selection
We obtained data of all claims filed in 2009 and 2010 for this inpatient medical expense
insurance product. It is divided into two categories, zero payoff and non-zero payoff,
according to insurer’s claim decision. We treat zero claim payoff as definite evidence
for the existence of fraud. The non-zero payoff claims could be further divided into
fully paid (adjusted to copayment and coverage limit) and partially paid claims.
However the majority of the partially paid claims are due to the deduction of payment
from the social medical insurance program, so it would be unfair to label them as fraud.
Therefore we treat all partially paid claims as legitimate claims in our analysis, and only
regard zero claim payoff as fraud cases.
Table 3 Summary of total claims and sampled claims
Year Total #
of claim
Fraudule
nt cases
Fraud % in
population
Sampled # of
claim
Sampled
Fraudulent
cases
Fraud % in
sample
2009 3,868 236 6.10% 451 155 34.37%
2010 4,205 255 6.06% 512 224 43.75%
Overall 8,073 491 6.08% 963 379 39.36%
Table 3 summarizes our sampling procedure. Overall in 2009 and 2010, the
percentage of fraudulent cases (zero claim payoff) is around 6%. In order to capture
enough fraud cases in the training sample to analyze its predictor variables, we use non-
random sampling, so in our sample the percentage of fraudulent cases increases to
39%.7 We will adjust for non-random sampling in specification 2
3.3 Descriptive Statistics
Table 4 gives a complete summary of variable definitions and descriptive statistics for
our sample. Overall, the data provides information on three different levels: first,
characteristics of the insured (sex, age, occupation, marital status, and income); second,
characteristics on healthcare provider and service (hospital type, days of hospital stay,
total cost, composition of the total cost); third, characteristics on policy (coverage,
7 Among all 491 fraud cases in 2009 and 2010, we aim to capture all information, but due to duplication of claims
and missing information, we were left with 77% of all fraud claims, resulting in 379 sampled fraudulent cases.
Among all 7582 legitimate claims, we randomly selected 600 claims (300 each from 2009 and 2010), and due to
duplication and missing information, resulting in 585 sampled legitimated cases comprising 7.7% of all legitimate
claims.
10
claim history information and number of policies purchased from other insurance
companies). Table 5 shows some descriptive measures for the two subsamples of
fraudulent and legitimate claims in comparison.
11
Table 4 Variable Definition and Summary Statistics
Variable Definition Mean Standard
Deviation Minimum Maximum
Dependent variable
Fraud1 Equals 1 if the claim is rejected completely, 0 otherwise. 0.394 0.489 0 1
Characteristics of the Insured
sex Equals 1 if male, 0 if female. 0.496 0.500 0 1
Age_claim Age of insured when the claim is filed. 28.150 19.559 0 60
child_dummy Equals 1 if the insured’s age is between 0 and 18 when the claim is filed, 0
otherwise. 0.332 0.471 0 1
adult_dummy Equals 1 if the insured’s age is between 19 and 59 when the claim is filed, 0
otherwise. 0.663 0.473 0 1
elder_dummy Equals 1 if the insured’s age is or above 60 when the claim is filed, 0 otherwise. 0.005 0.072 0 1
occupation A standard classification of occupation type from 1 to 6 with the greater number
corresponding to greater risk. 2.109 0.847 0 4
marital Equals 1 if married, 0 otherwise. 0.614 0.487 0 1
income Individual annual income. 60,885 49,831 6,000 500,000 Characteristics of healthcare provider and service
hosp_type
Hospital type equals 3 if it’s a grade III-A hospital, equals 2 if it’s a grade III
hospital, equals 1 if it’s a grade II-A hospital, and 0 otherwise. The grade III-A
hospitals are the best ones in China.
1.890 1.048 0 3
hosp_rec Equals 2 if the hospital is on the recommendation list of insurer, equals 1 if it is
assigned hospital of insurer, and 0 otherwise. 1.038 0.712 0 2
hosp_rec_dummy1 Equals 1 if the hospital is a qualified hospital of insurer but not on the 0.492 0.500 0 1
12
recommendation list, 0 otherwise.
hosp_rec_dummy2 Equals 1 if the hospital is not a qualified hospital of insurer, 0 otherwise. 0.235 0.424 0 1
hosp_day Number of days that the insured stayed in hospital this time. 14.020 13.900 0 218
hosp_day_pre Number of days for previous hospital stays under this policy. 0.541 3.635 0 72
tot_cost Total expenditure. 8,209 18,943 262 476,385
bed_per Percentage of expenditure on bed cost. 0.083 0.117 0.000 1.000
med_per Percentage of expenditure on medicine. 0.453 0.223 0.000 1.000
care_per Percentage of expenditure on care (labor). 0.017 0.029 0.000 0.500
diag_per Percentage of expenditure on diagnosis service. 0.013 0.036 0.000 0.502
treat_per Percentage of expenditure on treatment. 0.184 0.162 0.000 1.000
test_per Percentage of expenditure on lab test. 0.193 0.152 0.000 0.900
oper_per Percentage of expenditure on operation cost. 0.058 0.121 0.000 0.647
Characteristics of the policy
coverage_type The level of coverage (corresponding to levels in table 1). 1.130 0.380 1 3
self_policyholder Equals 1 if the insured is the policy holder and 0 otherwise. 0.563 0.496 0 1
renew The total number of years since the insured first purchased this product. 2.980 1.770 1 7
num_other_policy Number of valid policy the insured purchased from other insurance company. 0.078 0.318 0 3
self_claim Equals 1 if the insured filed the claim himself, and 0 otherwise. 0.733 0.443 0 1
claim_duration Number of days between policy commencement date and hospital admission date. 193.351 97.493 0 364
file_duration Number of days between hospital admission date and claim material submission
date. 68.627 84.520 6 829
claimfreq_pre Number of claims filed prior to current claim. 0.736 1.679 0 18
13
Table 5 Summary Statistics for Two Subsamples
Variable
Observed Fraudulent Claims Observed legitimate Claims
Mean Standard
Deviation Minimum Maximum Mean
Standard
Deviation Minimum Maximum
Mean
Difference P-Value
Dependent variable
fraud1 1.000 0.000 1 1 0.000 0.000 0 0
Characteristics of the Insured
sex 0.464 0.499 0 1 0.517 0.500 0 1 0.0527 0.110
age_claim 31.369 17.612 0 60 26.06
0 20.470 0 60 -5.3095*** 0.000
child_dummy 0.240 0.428 0 1 0.392 0.489 0 1 0.1520*** 0.000
adult_dummy 0.755 0.431 0 1 0.603 0.490 0 1 -0.1519*** 0.000
elder_dummy 0.005 0.073 0 1 0.005 0.072 0 1 -0.0001 0.977
occupation 1.939 0.816 0 4 2.219 0.849 0 4 0.2799*** 0.000
marital 0.686 0.465 0 1 0.567 0.496 0 1 -0.1192*** 0.000
income 62,244 55,267 10,000 500,000 60,00
3 45,989 6,000 500,000 -2241.0330 0.496
Characteristics of healthcare provider and
service
hosp_type 1.923 1.090 0 3 1.868 1.020 0 3 -0.0553 0.424
hosp_rec 0.860 0.708 0 2 1.154 0.691 0 2 0.2940*** 0.000
hosp_rec_dumm
y1 0.480 0.500 0 1 0.500 0.500 0 1 0.0198 0.549
hosp_rec_dumm
y2 0.330 0.471 0 1 0.173 0.379 0 1 -0.1569*** 0.000
hosp_day 13.960 12.023 0 113 14.05
8 15.002 0 218 0.0978 0.915
14
hosp_day_pre 0.322 2.251 0 22 0.683 4.297 0 72 0.3613 0.132
tot_cost 10,392 27,077 262 476,385 6,792 10,565 515 145,878 -3600.7280*** 0.004
bed_per 0.088 0.170 0.000 1.000 0.080 0.062 0.000 0.613 -0.0082 0.286
med_per 0.430 0.261 0.000 1.000 0.468 0.193 0.000 1.000 0.0384*** 0.009
care_per 0.015 0.036 0.000 0.500 0.018 0.023 0.000 0.234 0.0027 0.156
diag_per 0.015 0.054 0.000 0.502 0.011 0.017 0.000 0.215 -0.0037 0.119
treat_per 0.191 0.188 0.000 1.000 0.179 0.144 0.000 1.000 -0.0116 0.278
test_per 0.181 0.146 0.000 0.801 0.201 0.155 0.000 0.900 0.0202** 0.044
oper_per 0.081 0.139 0.000 0.647 0.043 0.105 0.000 0.629 -0.0377*** 0.000
Characteristics of the policy
premium 703 275 326 2398 734 245 326 1778 31.4033* 0.065
coverage_type 1.161 0.440 1 3 1.110 0.334 1 3 -0.0514** 0.040
self_policyholder 0.633 0.483 0 1 0.517 0.500 0 1 -0.1161*** 0.000
renew 2.171 1.537 1 7 3.502 1.715 1 7 1.3307*** 0.000
num_other_polic
y 0.069 0.301 0 3 0.084 0.328 0 3 0.0153 0.466
self_claim 0.789 0.409 0 1 0.697 0.460 0 1 -0.0920*** 0.002
claim_duration 185.285 93.049 0 364 198.5
86 99.980 0 363 13.3007** 0.039
file_duration 81.277 97.303 7 829 60.41
8 74.010 6 650 -20.8592*** 0.000
claimfreq_pre 0.327 0.934 0 8 1.002 1.977 0 18 0.6745*** 0.000
15
4 Model and Methodology
4.1 Discrete-choice Model
The model takes the form of a binary probit regression with the dependent variable
equal to one if the claim is identified as a fraudulent case. Assume the following
functional relationship:
*
i i iY X e
*
iY is the latent variable. iX is a vector of the observed explanatory variables. is
a vector of unknown parameter, and ie is a disturbance term. The claim will be
determined to be fraudulent if * 0iY , otherwise it’s legitimate. Let the observed
indicator of fraud be iY , then we have:
*1, if 0
0, otherwise
i i
i
Y Y
Y
The probability of fraud is
*Pr( 1| ) Pr( 0 | )
Pr( 0 | )
Pr( )
1 ( )
i i i i
i i i
i i
i
Y X Y X
X e X
e X
F X
The probability of the claim being legitimate is
*Pr( 0 | ) Pr( 0 | )
Pr( 0 | )
Pr( )
1 ( )
i i i i
i i i
i i
i
Y X Y X
X e X
e X
F X
where ( )F is the cumulative distribution function of ie .
If we assume that ie follows a normal distribution, it is a probit model that we choose.
Let the cumulative distribution function of standard normal distribution be ( ) , then
Pr( 1| ) ( )i i iY X X
16
Pr( 0 | ) 1 ( )i i iY X X
The log-likelihood function is
1
1ln [ ln ( ) (1 ) ln(1 ( ))]
n
i i i i
i
L Y X Y Xn
(1)
Due to the sampling method and nature of our data, we improve probit model in two
directions in the following two session. Model 1 in session 4.2 addresses the over-
sampling problem and model 2 in session 4.3 attempts to address the misclassification
problem.
4.2 Probit Model with Weighted Exogenous Sampling Maximum Likelihood
Estimation
Overall, 6% of all claims in 2009 and 2010 are fraudulent, but in our sample fraudulent
cases increases to 39% because of an oversampling of fraudulent claims. To adjust
for the oversampling, we follow Manski and Lerman (1977) to include a weighted
exogenous sampling maximum likelihood (WESML) estimator. It modifies the
classic log-likelihood function and provides a consistent and asymptotically normal
WESML estimator. Artis, Ayuso and Guillen (1999) use this method to correct the
oversampling of fraud claims in auto insurance.
Consider the following specific weighted exogenous sampling likelihood function
corresponding to our model.
1 0
{ 1} { 0}
ln ( | ) ln( ) ln(1 )i i
w i i
y y
L y p p
(2)
Where,
11
2
1
1
10
2
Here 1 is the percentage of fraudulent samples in the total claims (population), and
2 is the percentage of fraudulent samples in the sample. The summary of weighted
factors are given in Table 6. We obtain the estimates by maximizing equation(2).
17
Table 6 Summary of the weighted factors
Year 1 2 1 0
2009 4.01% 34.37% 1.463 0.117
2010 5.33% 43.75% 1.683 0.122
Total 4.69% 39.36% 1.572 0.119
4.3 Maximum Likelihood Estimation with Omission Error
Detecting fraud is a classification problem. There are two types of misclassification,
but in this paper we assume that all fraudulent claims are correctly classified and the
only possible misclassifications is omission error (undetected fraudulent claims by
insurer). Following the method proposed by Hausman et al. (1998), we take the
omission error into consideration in model 2, and estimate the percentage of fraudulent
claims that are not detected. Artis, Ayuso and Guillen (2002) also applies this method
to auto insurance market.
Assume a regression model for *
iY such that:
*
i i iY X e
Let iY be a dichotomous variable indicating presence of fraud such that:
*1, if 0i iY Y
0, otherwise.iY
If there is no measurement error in the response, iY indicates the true outcome with the
following probability:
*Prob( =1| )=Prob( >0| )i i i iY X Y X
Within the misclassification frame work, assume that the observed dependent variable
could be different from the underlying outcome. Call the observed binary variable iY .
Assume that the probability of misclassification is as follows:
0 Prob( =1| =0)i iY Y
1 Prob( 0 | 1)i iY Y
18
In our specification, we assume 0 0 , and estimate 1 .
The conditional expectation of the observed dependent variable is given by:
1( | ) (1 ) ( )i i iE Y X X
Where ( ) is the cumulative distribution function of the standard normal distribution.
The corrected log-likelihood function is:
1 1
1
1ln [ ln(1 ) ( ) (1 ) ln(1 (1 ) ( ))]
n
i i i i
i
L Y X Y Xn
(3)
1 can be estimated by maximizing the log-likelihood function in equation (3).
5 Empirical Results and Discussions
Corresponding to model specifications in section 4, we consider three specifications in
our model. First, we use probit model to obtain the estimation. Second, we take into
account the effect of the over-representation of fraudulent claims in our sample. And in
the third specification, the omission error is considered.
The dependent variable is the claim decision judged by the insurance company. We treat
claim completely rejected as fraudulent claim, and therefore the dependent variable
equals to one, and zero otherwise. The explanatory variables include indicators for
fraudulent claim as well as control variables of the insured.
We perform a likelihood ratio test, and the result is 18.9 with 1 degree of freedom. This
indicates that a significant improvement occurs when we includes the omission error
parameter (specification 3), compared with the restricted model with no omission errors
(specification 2).
In specification 3, we find that the parameter 1 estimating the probability of omission
error is significantly different from zero. The result shows that the fraudulent claims are
underestimated by 4.66 percent. The complete regression results are shown in Table 7.
19
Table 7 Regression Results
SPECIFICATIONS 1: Probit 2: over-sampling
addressed
3: omission error
addressed
VARIABLES
sex 0.00518 0.0230 0.0036
(0.0973) (0.160) (0.483)
child_dummy -0.501* -0.660 -2.411*
(0.275) (0.442) (1.421)
elder_dummy 0.598 0.493 1.309
(0.660) (1.004) (2.665)
occupation -0.137* -0.0911 -0.263
(0.0751) (0.124) (0.338)
marital -0.230 -0.277 -1.578
(0.191) (0.316) (1.252)
lnincome 0.0117 0.0217 -0.218
(0.0855) (0.143) (0.469)
hosp_type 0.0297 0.0143 -0.0225
(0.0505) (0.0800) (0.17)
hosp_rec_dummy1 0.146 0.138 0.29
(0.119) (0.187) (0.414)
hosp_rec_dummy2 0.738*** 0.760*** 1.814**
(0.138) (0.230) (0.899)
hosp_day -0.0108** -0.0109 -0.042*
(0.00455) (0.00731) (0.0224)
hosp_day_pre -0.0495** -0.0532* -0.165*
(0.0203) (0.0311) (0.0885)
lntot_cost 0.249*** 0.199* 0.166
(0.0725) (0.113) (0.259)
bed_per 1.264*** 1.042 1.749
(0.490) (0.761) (1.528)
care_per -2.508 -2.510 -8.015
(1.604) (2.314) (6.27)
diag_per 3.569* 2.833 4.797
(1.824) (3.096) (7.642)
treat_per 0.00177 0.181 1.52
(0.299) (0.493) (1.628)
test_per -0.123 0.0447 0.698
(0.330) (0.534) (1.714)
oper_per 0.707* 0.895 4.767
(0.397) (0.664) (2.922)
coverage_type 0.177 0.133 0.319
(0.129) (0.206) (0.474)
self_policyholder 0.128 0.213 0.922
(0.149) (0.244) (0.66)
renew -0.319*** -0.286*** -0.785**
(0.0316) (0.0504) (0.345)
num_other_policy -0.188 -0.132 0.052
(0.148) (0.231) (0.61)
self_claim -0.0796 -0.146 -0.196
(0.175) (0.290) (0.809)
claim_duration -0.00195*** -0.00181** -0.00518**
(0.000490) (0.000807) (0.00214)
file_duration 0.00173*** 0.00216* 0.0246**
(0.000600) (0.00119) (0.0124)
claimfreq_pre -0.0795** -0.0653 -0.0318
(0.0399) (0.0567) (0.114)
20
Constant -1.193 0.436 6.115
(1.129) (1.882) (6.49)
1 - - 0.0466***
- - (0.0135)
0 - - 0
- - -
Pseudo R2 0.2305 0.2057 -
Observations 963 963 963
*** p<0.01, ** p<0.05, * p<0.1, standard errors are in parentheses
In Table 7, we find that most of the parameters’ signs are consistent with our expectation.
Table 8 lists the expected versus the obtained parameter signs.
Table 8 Comparison for the Expected and the Obtained Parameter Signs
Variable Obtained sign Expected Sign
hosp_type Inconsistent8 -
hosp_rec_dummy1 + +
hosp_rec_dummy2 + +
hosp_day - +
hosp_day_pre - -
lntot_cost + +
bed_per + indefinite
care_per - indefinite
diag_per + indefinite
treat_per + indefinite
test_per inconsistent indefinite
oper_per + indefinite
coverage_type + +
self_policyholder + indefinite
Renew - -
num_other_policy inconsistent -
self_claim - -
claim_duration - -
file_duration + +
claimfreq_pre - -
8 Inconsistent indicates the signs of coefficient are not all the same across three different specifications.
21
Most of the signs of parameter are consistent in all three specifications except for
coefficients of income (lnincome), hospital type (hosp_type), test percentage (test_per)
and number of other policies (num_other_policy). The coefficients of these four
explanatory variables are not significant though.
As shown in Table 8, we find several indicators for fraudulent medical claims. And
most of them are related to either medical service and provider, or measurement of
insurance policy.
The hosp_rec_dummy2 variable demonstrates a strong negative relationship with a
claim being fraudulent. It shows that if the insured seeks medical service in an
unqualified provider of the insurer, it’s more likely to be a fraudulent case. However,
the hosp_rec_dummy1 variable which indicates it’s a qualified provider but not being
recommended by insurer is not significant. But it does have a positive sign as expected,
showing that compared to providers recommended by insurer, those not on the
recommendation list have a higher probability of committing fraud.
Both the length of hospital stay in this time and in prior are significant indicators of
committing medical fraud. And all signs in three specifications are negative, meaning
that the longer the insured stays in the hospital this time or in prior, the lower the
probability of fraudulent claims is. The expected sign of number of hospital stay is
different from our original hypothesis. We propose two reasons. First, the longer the
hospital stay is, the higher the probability that the claim will be subjected to scrutinize
in claim handling, therefore the insured who plan to commit fraud will choose to keep
the hospital stay in a reasonable limit. Second, there is coverage limit for bed charge
that could be reimbursed by this insurance product, therefore if it’s a planned fraud, the
fraudster will limit the length of his/her stay.
The influence of the total cost is significant at the 1 percent significance level in
specification 1 and is significant at the 10 percent significance level in specification 2.
The parameter signs in all three specifications are positive, indicating that the higher
the total cost, the higher the probability of fraudulent claims is, which is consistent with
22
our expectation.
Different from results in prior study (Shin et al., 2012), the influence of composition of
expenditure are not significant in general. Only bed charge, diagnosis expenditure and
operation cost are significant at 10 percent level in specification 1, but none is
significant when over sampling or omission error is taken into consideration. The major
reason we propose is that the prior studies either controlled for diagnosis information
or just focused on certain kind of disease (Ireson, 1997). In our sample, we have limited
number of observations and various disease types, therefore, without controlling for
disease type, the cost composition cannot be used to predict fraudulent cases.
The renew variable indicating the total number of years since the insured first purchased
this product. Consistent with our expectation, the further the insured renewed with the
same insurer, the less likely he/she commits fraud.
The variables of claim_duration and file_duration are both significant in all three
specifications, and the signs are consistent with our expectation. The claim_duration
measures the number of days between the policy commencement and hospital
admission. The negative sign shows that the insured who would like to commit fraud is
eager to forge the accidents. The file_duration measures the number of days between
hospital admission to claim material submission. The positive sign shows that insured
who spend more time on preparing the claim material are more likely to commit fraud.
The number of claims filed prior to the current claim has a negative impact on the
probability of fraud as expected, but it’s only significant at the 5 percent level in
specification 1.
In our set of control variables regarding the characteristics of the insured, most of them
are not statistically significant when omission error is considered, except for the
child_dummy. The sign of child_dummy parameter is negative, as expected, since
children are less likely to be involved in medical insurance fraud.
23
Table 9 Marginal effects
MODELS Probit Model 1 Model 2
VARIABLES
sex 0.00193 0.00282 9.253E-07
(0.0362) (0.0197) -
child_dummy -0.187* -0.0810 -6.204E-04
(0.103) (0.0543) -
elder_dummy 0.223 0.0605 3.367E-04
(0.246) (0.123) -
occupation -0.0510* -0.0112 -6.779E-05
(0.0280) (0.0152) -
marital -0.0857 -0.0341 -4.061E-04
(0.0712) (0.0386) -
lnincome 0.00435 0.00266 -5.599E-05
(0.0318) (0.0175) -
hosp_type 0.0111 0.00176 -5.780E-06
(0.0188) (0.0098) -
hosp_rec_dummy1 0.0544 0.0170 7.454E-05
(0.0443) (0.0230) -
hosp_rec_dummy2 0.275*** 0.0933*** 4.668E-04
(0.0511) (0.0285) -
hosp_day -0.00401** -0.00133 -1.080E-05
(0.00169) (0.00090) -
hosp_day_pre -0.0184** -0.00654* -4.238E-05
(0.00753) (0.00384) -
lntot_cost 0.0926*** 0.0244* 4.259E-05
(0.0270) (0.0139) -
bed_per 0.471** 0.128 4.499E-04
(0.183) (0.0920) -
care_per -0.934 -0.308 -2.062E-03
(0.598) (0.282) -
diag_per 1.330* 0.348 1.234E-03
(0.681) (0.375) -
treat_per 0.000660 0.0223 3.911E-04
(0.112) (0.0606) -
test_per -0.0459 0.00549 1.795E-04
(0.123) (0.0655) -
oper_per 0.263* 0.110 1.226E-03
(0.148) (0.0809) -
coverage_type 0.0660 0.0163 8.210E-05
(0.0481) (0.0253) -
self_policyholder 0.0475 0.0261 2.372E-04
(0.0555) (0.0299) -
renew -0.119*** -0.0351*** -2.019E-04
(0.0117) (0.00634) -
num_other_policy -0.0699 -0.0162 1.339E-05
(0.0551) (0.0285) -
self_claim -0.0297 -0.0179 -5.051E-05
(0.0650) (0.0357) -
claim_duration -0.000728*** -0.000223** -1.333E-06
(0.000182) (0.0000988) -
file_duration 0.000645*** 0.000265* 6.320E-06
(0.000224) (0.000144) -
claimfreq_pre -0.0296** -0.00802 -8.180E-06
(0.0148) (0.00702) -
24
*** p<0.01, ** p<0.05, * p<0.1, robust standard errors clustered by groups are in parentheses
Marginal effects at the means of independent variables are reported in Table 9. We note
that the marginal effect in specification 3 is very small, compared to the other 2 models.
The underlying reason is that our latent variable *Y in specification 3 is higher
compared with the ones in specification 1 and 2. In a probit model, the probability of a
case being fraudulent is ( )X ,therefore the marginal effect of iX is ( ) iX , in
which ( ) denotes the density function of a standard normal distribution. X
represents the latent variable *Y ,and could be calculated after a regression assuming
each of iX taking its mean. As we taking both over representation and omission error
into consideration in specification 3, the predicted *Y becomes larger, resulting in the
marginal effect to be diminishing, as shown in Figure 1.
Figure 1: Marginal Effects in Different Specifications
To check the adequacy of our models, we report the classification results in Table 10,
11 and 12. We chose the threshold of predicting fraudulent claim using a grid search
framework, and we made compromise between the best classification in whole sample
and the best classification within fraudulent cases.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
-2.5
-2.3
-2.1
-1.9
-1.7
-1.5
-1.3
-1.1
-0.9
-0.7
-0.5
-0.3
-0.1
0.1
0.3
0.5
0.7
0.9
1.1
1.3
1.5
1.7
1.9
2.1
2.3
2.5
Y
Y* in specification 2
Y* in specification 1
Y* in specification 3
( )Y
25
Table 10 Classification Table for Specification 1
Predicted Type
Legitimate Fraudulent Total
Observed Type
Legitimate 474 110 584
Fraudulent 145 234 379
Total 619 344 963
When the estimated probability of fraud exceeded 0.5,
the predicted type was fraud.
Table 11 Classification Table for Specification 2
Predicted Type
Legitimate Fraudulent Total
Observed Type
Legitimate 317 267 584
Fraudulent 42 337 379
Total 359 604 963
When the estimated probability of fraud exceeded 0.8,
the predicted type was fraud.
Table 12 Classification Table for Specification 3
Predicted Type
Legitimate Fraudulent Total
Observed Type
Legitimate 339 245 584
Fraudulent 54 325 379
Total 393 570 963
When the estimated probability of fraud exceeded 0.9,
the predicted type was fraud.
In specification 1 (the basic probit model), using threshold of 0.59, the total percentage
of observations being correct classified was 74 percent, which is acceptable. The
conditional percentage of legitimate claims that were correctly classified was 81 percent.
However, the conditional percentage of fraudulent claims that were correctly classified
was only 62 percent, showing that the probit model without weighted sampling and
omission error is not ideal for detecting medical insurance fraud.
In specification 2, the threshold was set to 0.8 since it yields the highest overall
classification percentage while keeping the correctly classified fraudulent cases above
85%. In this case, the conditional percentage of fraudulent claims being correctly
classified was about 89 percent and the percentage of legitimate claims being correctly
9 For a complete result of threshold grid search, please refer to appendix.
26
classified was 54 percent. Overall, 68 percent of observations are correctly classified.
In this way, the model is more effective in detecting fraud than the basic probit model.
Using the same criteria as in specification 2, the threshold was set to 0.9 in specification
3 to yield the best compromise between overall performance and the segment of
fraudulent claim. The conditional percentage of fraudulent claims being correctly
classified was about 86 percent and the percentage of legitimate claims being correctly
classified was 58 percent. The total percentage of correct classification was 67 percent,
which is acceptable in terms of both adequacy and efficiency in detecting the medical
insurance fraud.
6 Concluding Remarks
Health insurance fraud causes higher insurance prices and significant welfare loss to
society, therefore, detecting fraud is important for improving efficiency in the insurance
industry. The fraud detection techniques have been studied extensively by both
academics and industry analysts, yet most empirical studies focus on fraud in health
insurance in developed countries and there is little evidence on the nascent commercial
health insurance market in China.
We use a discrete choice model considered for over-sampling and omission error to
identify the predictive factors of medical insurance fraud, and we find hospital’s
qualification, total cost of healthcare, policyholder’s renewal status, claim duration and
file duration are contributing factors of medical insurance fraud.
Our research provide a significant contribution by broadening the understanding of
predictive variables for health insurance fraud in China. We expect our analysis to
help insurers in China to better evaluate their claims and improve the efficiency and
accuracy of claim management.
27
Appendix:
Grid search result for thresholds in classification is shown in table 13.
Table 13 The percentage of correctly classified claims
under different levels of threshold
Threshold
Correctly
classified % 0.4 0.45 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95
Specification 1
total* 73.31 73.94 73.52 73.52 72.79 71.34 69.68 66.87 65.01 - - -
fraudulent* 75.46 68.87 61.74 55.67 48.28 39.05 30.34 21.11 14.78 - - -
legitimate* 71.92 77.23 81.16 85.10 88.70 92.29 95.21 96.58 97.60 - - -
Specification 2
total 44.65 47.04 48.08 50.47 54.00 57.42 59.81 63.86 67.91 71.13 73.10 72.48
fraudulent 100.00 99.74 99.47 98.94 98.42 97.89 96.83 94.20 88.92 81.00 70.71 50.40
legitimate 8.73 12.84 14.73 19.01 25.17 31.16 35.79 44.18 54.28 64.73 74.66 86.82
Specification 3
total 55.24 56.49 58.15 58.98 60.44 61.99 63.03 64.80 65.94 67.71 68.95 70.61
fraudulent 99.47 98.94 98.68 98.42 98.15 97.89 97.36 96.04 93.93 91.29 85.75 67.28
legitimate 26.54 28.94 31.85 33.39 35.96 38.70 40.75 44.52 47.77 52.40 58.05 72.77
Note:
Total: The total percentage of observations being correct classified.
Fraudulent: The percentage of fraudulent claims being correctly classified.
Legitimate: The percentage of legitimate claims being correctly classified.
Reference:
Ai, J., P. Brockett, and L. Golden (2009) “Assessing Consumer Fraud Risk in Insurance
Claims: An Unsupervised Learning Technique Using Discrete and Continuous
Predictor Variables”, North American Actuarial Journal, 13(4):439-458.
Artís, M., M. Ayuso, and M. Guillén (1999) “Modelling Different Types of Automobile
Insurance Fraud Behaviour in the Spanish Market”, Insurance: Mathematics and
Economics, 24: 67-81.
Artís, M., M. Ayuso, and M. Guillén (2002) “Detection of Automobile Insurance Fraud
with Discrete Choice Models and Misclassified Claims”, The Journal of Risk and
Insurance, 69(3):325-340.
Belhadji, El Bachir, G. Dionne, and F. Tarkhani (2000), "A Model for the Detection of
Insurance Fraud", The Geneva Papers on Risk and Insurance, 25(4):517-538.
Brockett, P. L., R. Derrig, L. Golden, A. Levine, and M. Alpert (2002), The Journal of
Risk and Insurance, 69(3): 341-371.
Brockett, P. L., X. Xia, and R. A. Derrig (1998) "Using Kohonen's Self-Organizing
28
Feature Map to Uncover Automobile Bodily Injury Claims Fraud", The Journal of Risk
and Insurance, 65(2): 245-274.
Caudill, S. B., M. Ayuso, and M. Guillén (2005) “Fraud Detection Using a Multinomial
Logit Model with Missing Information”, The Journal of Risk and Insurance, 72(4):
539-550.
Derrig, R. A. (2002) “Insurance Fraud”, The Journal of Risk and Insurance, 69(3): 271-
287.
Derrig, R.A., and K.M. Ostaszewski (1995),"Fuzzy Techniques of Pattern
Recognition in Risk and Claim Classification", The Journal of Risk and Insurance,
62(3), 447-482.
Hausman J. A., J. Abrevaya, and F. M. Scott-Morton (1998) “Misclassification of the
Dependent Variable in a Discrete-response Setting”, Journal of Econometrics, 87: 239-
269.
He, H., J. Wang, W. Graco, and S. Hawkins (1997),"Application of Neural Networks to
Detection of Medical Fraud", Expert Systems with Applications, 13(4): 329-336.
He, H., W. Graco, and X. Yao (1999), "Application of Genetic Algorithm and k-Nearest
Neighbour Method in Medical Fraud Detection", 2nd Asia-Pacific Conference on
Simulated Evolution and Learning (SEAL 98), Nov. 24-27, 1998.
Ireson, C. L (1997), "Critical Pathways: Effectiveness in Achieving Patient Outcomes",
Journal of Nursing Administration, 27(6): 16-23.
Kou, Y., C. Lu, S. Sirwongwattana, and Y. Huang (2004),"Survey of Fraud Detection
Techniques", International Conference on Networking, Sensing & Control Taipei,
Taiwan, March 21-23, 2004.
Li, J., K. Huang, J. Jin, and J. Shi (2008) "A Survey on Statistical Methods for
Healthcare Fraud Detection", Health Care Manage Science, 11:275-287.
Liou, F., Y. Tang, and J. Chen (2008), "Detecting Hospital Fraud and Claim Abuse
through Diabetic Outpatient Services", Health Care Manage Science, 11:353–358.
Major, J. A. and D. R. Riedinger (2002), "EFD: A Hybrid Knowledge/Statistical-Based
System for the Detection of Fraud", The Journal of Risk and Insurance, 69(3):309-324.
Manski, C. and S. R. Lerman (1977) “The Estimation of Choice Probabilities from
Choice Based Samples”, Econometrica, 45(8):1977-1988.
Mao, L. (2008) “Research on the Health Insurance Anti-fraud in China”, working
paper, http://www.docin.com/p-224528482.html.
Ortega, P. A., C. J. Figueroa and G. A. Ruz (2006), "A Medical Claim Fraud/Abuse
Detection System based on Data Mining: A Case Study in Chile", In Proceedings of
International Conference on Data Mining, Las Vegas, Nevada, USA.
Shin, H., H. Park, J. Lee, and W. C. Jhee (2012), "A Scoring Model to Detect Abusive
29
Billing Patterns in Health Insurance Claims", Expert Systems with Applications,
39:7441-7450.
Stefano, B., and F. Gisella (2001), "Insurance Fraud Evaluation A Fuzzy Expert
System", 2001 IEEE International Fuzzy Systems Conference.
Viaene, S., R. A. Derrig, B. Baesens, and G. Dedene (2002), "A Comparison of State-
of-the-Art Classification Techniques for Expert Automobile Insurance Claim Fraud
Detection", The Journal of Risk and Insurance, 69(3):373-421.
Weisberg, H. I., and R. A. Derrig (1991),"Fraud and Automobile Insurance: A Report
on the Baseline Study of Bodily Injury Claims in Massachusetts", Journal of Insurance
Regulation, 9: 497-541.
Weisberg, H. I., and R. A. Derrig (1995),"Identification and Investigation of Suspicious
Claims, in: AIB Cost Containment/Fraud Filing", (DOI Docket R95-12) (Boston, Mass.:
Automobile Insurers Bureau of Massachusetts).
Weisberg, H. I., and R. A. Derrig (1998),"Quantitative Methods for Detecting
Fraudulent Automobile Bodily Injury Claims", Risques, July-September: 35: 75-99.
Yamanishi, K., J. Takeuchi, G. Williams, and P. Milne (2004), "On-line Unsupervised
Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms", Data
Mining and Knowledge Discovery, 8:275-300.
Yang, W., S. Hwang (2006), "A Process-mining Framework for the Detection of
Healthcare Fraud and Abuse", Expert Systems with Applications, 31:56-68.