Graph Analysis for Detecting Fraud, Waste, and Abuse in ...xqzhu/courses/kmelin/graph...Graph Analysis for Detecting Fraud, Waste, and Abuse in Health-Care Data Juan Liu, Eric Bier,

Health-care expenditures in the United States exceed $2trillion a year. Driven by the market size, health carehas become an important and fast growing applica-

tion domain for data analytics. McKinsey’s influential reporton big data analytics (Manyika et al. 2011) lists health care asthe top most promising application domain. One significantproblem of health care is the loss of health-care expendituresto fraud, waste, and abuse (FWA). Figure 1 lists the amount ofimproper payment in U.S. government expenditure. In 2012,improper payments totaled about $120 billion. Health-care-related programs such as Medicaid, Medicare fee for service(FFS) and parts C and D contribute significantly, representingmore than half of the total. A separate report from the Insti-tute of Medicine (IOM) estimates the annual loss to FWA inthe health-care domain to be $750 billion (Institute of Med-icine 2012). The magnitude of the fraud problem has attract-ed many efforts from the health-care industry, the data-ana-lytics industry, and research communities to developfraud-detection systems.

Articles

SUMMER 2016 33Copyright © 2016, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602

Graph Analysis for Detecting Fraud, Waste, and Abuse

in Health-Care Data

Juan Liu, Eric Bier, Aaron Wilson, John Alexis Guerra-Gomez, Tomonori Honda, Kumar Sricharan, Leilani Gilpin, and Daniel Davies

� Detection of fraud, waste, and abuse(FWA) is an important yet challengingproblem. In this article, we describe asystem to detect suspicious activities inlarge health-care data sets. Each health-care data set is viewed as a heteroge-neous network consisting of millions ofpatients, hundreds of thousands of doc-tors, tens of thousands of pharmacies,and other entities. Graph-analysis tech-niques are developed to find suspiciousindividuals, suspicious relationshipsbetween individuals, unusual changesover time, unusual geospatial disper-sion, and anomalous network structure.The visualization interface, known asthe network explorer, provides a goodoverview of data and enables users tofilter, select, and zoom into networkdetails on demand. The system hasbeen deployed on multiple sites anddata sets, both government and com-mercial, and identified many overpay-ments with a potential value of severalmillion dollars per month.

Health-care-related programs include federal andstate government programs such as Medicaid,Medicare Advantage (Part C), Medicare FFS, andMedicare Prescription Drug Benefit (Part D). Non-health-care programs include Earned Income TaxCredit (EITC), Pell Grants, Public Housing/RentalAssistance, Retirement, Survivors and Disability Insur-ance (RSDI), School Lunch, Supplemental NutritionAssistance Program (SNAP), Supplemental SecurityIncome (SSI), Unemployment Insurance (UI), andother programs. Source: www.paymentaccuracy.gov.

Despite the substantial financial significance, thefraud-detection problem is still far from being solved.While health-care data (insurance claims, healthrecords, clinical data, provider information, and oth-ers) offers tantalizing opportunities, it also poses a

series of technical challenges. From a data represen-tation view, health-care data sets are often large anddiverse. It is common to see a state’s Medicaid pro-gram or a private health-care insurance program hav-ing hundreds of millions of claims per year, involvingmillions of patients and hundreds of thousands ofproviders of various types, for example, physicians,pharmacies, clinics and hospitals, and laboratories.Any fraud-detection system needs to be able to han-dle the large data volume and data diversity. Further-more, health-care programs and perpetrators are inthe fraud arms race and constantly make adjustmentsto gain a competitive advantage over the other side.Data patterns from both sides are dynamic. The com-plexity of the problem calls for a rich set of tech-niques to examine health-care data.

Articles

34 AI MAGAZINE

Figure 1. Improper Payments in Government Expenditure.

2004

45.83%

54.17%

72.50%

27.50%

73.81%

26.19%

49.32%

42.45%

57.55%

44.63%

55.37%

43.97%

56.03%

41.28%

58.72%

39.62%

60.38%

50.68%

31.58%

68.42%

45.83%

$0B

$10B

$20B

$30B

$40B

$50B

$60B

$70B

$80B

$90B

$100B

$110B

$120B

2005 2006 2007 2008 2009 2010 2011 2012 2013

Non health care programs

Program Category

Health care related

Imp

rop

er P

aym

ents

Years

Fiscal Reporting Year

Improper Payment Amounts (FYs 2004-2013)

Articles

SUMMER 2016 35

Health-care financials are complex, involving amultitude of providers (physicians, pharmacies, clin-ics and hospitals, and laboratories), payers (insuranceplans), and patients. To design a good fraud-detectionsystem, one must have a deep understanding of thefinancial incentives of all parties. Terry L. Leap’s bookon medical billing (Leap 2011) provides a niceoverview of each type of providers, their financialpractices, and common fraud activities. Starting fromdomain knowledge, auditors and investigators havedesigned fraud-detection rules to watch out for falseclaims. The rules cover diverse behaviors, for instance,total sum of medical service hours claimed for anygiven day close to or exceeding 24 hours, work hoursduring short holiday weeks or under adverse weatherconditions staying high, unusually high proportionof home visits, and billing of service to patients whoare already deceased. Readers may refer to Rebecca S.Busch’s Healthcare Fraud: Auditing and Detection Guide(Busch 2012), for a comprehensive list of examplerules. This methodology of comparing data against apredefined rule set is widely adopted in auditing prac-tice and works effectively, but its performance isinherently limited by subject matter expert knowl-edge, which can be inaccurate and incomplete. Fur-thermore, new fraud patterns are constantly inventedto circumvent the built-in fraud-detection rules. A dif-ferent alternative, thriving due to the recent advancesof machine learning and big-data infrastructure, is thedata-driven methodology that identifies normal pat-terns from real data and detects outliers deviatingfrom the norm. Various approaches have been devel-oped for outlier detection, such as density-basedapproaches to identify points in low probabilityregions, proximity-based approaches to identifypoints isolated from others, subspace-based methodsto identify rare classes, and supervised or semisuper-vised learning methods to learn differences betweennormal and abnormal classes (Aggarwal 2012). Com-pared to their rule-based counterparts, data-drivenapproaches are more flexible, but computationallyintense, as the search space for fraud is vast. We advo-cate a combined approach, where domain knowledgeis used to guide the search, while data-drivenmachine-learning methods do the rigorous comput-ing to improve upon expert intuition to achieve bet-ter accuracy and flexibility.

To develop our overall system, we work with XeroxServices, which provides review and auditing servic-es to a number of government health-care programsand private-sector health-insurance companies. Ourtool, known as the Xerox Program Integrity Validator(XPIV), has been deployed on multiple sites and is inuse by fraud analysts in their investigation practice.The tool provides two broad categories of functional-ities: (1) automated screening, which enables an ana-lyst to focus attention on a small list of suspectproviders, as opposed to a prohibitively large set, and(2) interactive drill down, where the analyst starts

from a suspicious individual or activity (as singledout by the automated screening components) andinteracts with the system to navigate through dataitems and collect evidence to build an investigationcase. The two categories have quite different techni-cal foci. Automated screening (1) focuses on algo-rithmic design for detecting diverse forms of anom-alies, and interactive drill down (2) focuses ondatabase indexing/caching for fast data retrieval anduser interface design for intuitive user–system inter-action. For the conciseness of this article, we do notattempt to describe the complete XPIV system, butonly describe a particular subset of techniques, name-ly graph analysis, to detect suspicious activities andrelationships. Other components of XPIV, such asoutlier detection and duplicate detection, are left outof the scope of this article and may be discussed infollow-up publications.

Graph AnalysisWe have worked with fraud analysts to understandreal needs in their investigation effort. One commonconcern that the analyst would like to have help withis to detect organized crimes. Recent years have wit-nessed crime rings migrating from illegal drug traf-ficking to the safer and far more lucrative business ofperpetrating frauds against health care. The NationalHealth Care Anti-Fraud Association (NHCAA) hasreported that, in Florida alone, government Medicareand Medicaid programs and private health-insurancecompanies have lost hundreds of millions of dollarsin recent years to criminal rings.1 Collusion amongdishonest practitioners has also become common,with fraudulent activities such as self-referrals, falseor unnecessary referrals, and kickbacks. Patients maybe involved as well. A 2013 investigative series by theCenter for Investigative Reporting and CNN2 uncov-ered rampant overbilling in California’s publiclyfunded drug rehabilitation system and fraud schemessuch as clinics recruiting homeless to pose aspatients. It is reported that California paid $94 mil-lion in fiscal years 2012 and 2013 to clinics that haveshown signs of deception or questionable billing.

To detect crime rings and collusion networks suchas the ones mentioned above, we need graph-analyt-ics methods to examine data points in relation toothers.

Compared to anomaly detection schemes, whichfocus on attributes of individuals, examining therelational aspects provides a new perspective. Oursystem is the first of its kind that allows fraud ana-lysts to detect network-based fraud. Each data set isrepresented as a large and heterogeneous graph,where nodes represent millions of patients and hun-dreds of thousands of providers, such as doctors, hos-pitals, and pharmacies, and edges represent billionsof claimed services, medications, and supplies involv-ing multientity relationships among them. In thisarticle, we describe the technical components.

Graph analysis, originally rooted in network sci-ence and graph theory, has been extended to a vari-ety of applications such as communication networks,bioinformatics, and operations research. The recentdecade has seen a rapid adoption of graph-basedtechniques to analyze large scale social interactionssuch as the world wide web and social media such asFacebook, Twitter, and LinkedIn. We have extendedthese techniques to analyze health-care data for FWAdetection. In particular, we look for four types ofanomalies in the graph: (1) suspicious individuals, (2)suspicious relationships, (3) anomalous temporalchanges and geospatial characteristics, and (4) struc-tures.

Suspicious individuals. We examine each individualentity (patient, provider, pharmacy, and so on) basedon its attributes.

Suspicious relationships in the graph. While the pre-vious type focuses on individual attributes, this typefocuses on pairwise relationships. While individualsmay appear perfectly normal, each out-of-norm rela-tionship warrants a red flag.

Anomalous temporal changes and geospatial charac-teristics in the graph. Our analysis couples graph analy-sis with temporal and geospatial analysis to look forunusual temporal changes or unusual geospatial dis-tributions.

Structures in the graph. Graph techniques can revealstructure, including clusters of doctors referring toeach other or a heavily connected group of individu-als associated with narcotics transactions. We usegraph structure analysis techniques to identifyanomalous structures.

The sections that follow provide a few concreteexamples of graph-analysis techniques for FWAdetection. Loosely speaking, graph-analysis tech-niques fall under two categories. The first category,known as the ego-net approach, focuses on individualnodes and distills features from a node’s local neigh-borhood. Features include, for instance, degree andentropy of local connectivities. We have developedego-net approaches to examine narcotics relation-ships and temporal and spatial characteristics ofpatient flow between pharmacies and providers. Thesecond category analyzes the global structure of thehealth-care relation network and looks for commu-nities sharing a common abnormal practice, or tight-knit communities that are anomalous in their aggre-gated statistics. The structural approach can identifyfraud networks such as collusion networks or organ-ized crime. The two categories combined togetherencompass both the local and the global characteris-tics. We also briefly describe network explorer, a visu-alization and user interface that supports an eagle-eye view of the entire network and an interactive drilldown of suspicious nodes within its local ego-net.

Automated screening faces the technical challengeof balancing false alarms and missed detection. Anaccurate characterization of performance, such as a

ROC curve or precision or recall metric, would begreat. However, we note that the performance met-rics are extremely hard to measure due to the highcost of investigation and the extreme class imbal-ance. Each investigation case bears a cost, rangingfrom $200 for a simple desk audit to $20,000 for atypical crime investigation. The approach of subsam-pling a set and labeling each data point to obtain theprecision or recall metric, though common in aca-demic studies, is infeasible here, as it would requirelabeling a very large set (due to class imbalance inwhich a few fraud cases are buried in the sea of regu-lar cases) and hence incur a prohibitively large inves-tigation cost. This is an inherent drawback of thefraud-detection application domain. For practicalbusiness reasons, we have designed our system toproduce high precision and low recall. We resort toempirical validation, reporting cases of findings andballpark recovery dollar amounts. Though still pre-liminary, our system is being widely used by the ana-lysts to focus their investigation effort. This under-lines the value of the network analytics methodspresented in this article. Currently we are workingwith our collaborators to integrate user feedback,such as confirmation or dismissal of red-flagged cas-es to obtain more rigorous precision metrics.

Due to HIPAA restrictions3 and other business con-straints, we cannot disclose full details such as per-sonal health information (PHI) and business identi-ties. Instead we present a high-level description, withall sample results anonymized.

Analysis of Narcotics Relationship Graphs

In this section, we illustrate graph-analysis methodsto detect suspicious individuals and suspicious rela-tionships using a concrete example of narcotics use,prescription, and sales. Narcotics is of concernbecause of the growing abuse of medications andillicit drug trafficking. In recent years, narcotics havegrown to be used recreationally, and they are highlyaddictive (Epstein 1989). Despite federal efforts torestrict narcotics prescriptions, narcotics abuse con-tinues to be a problem. In addition, narcotics can beillegally sold at a very high value because of the highdemand and limited supply. Many people who abusenarcotics illicitly obtain them from patients withlegitimate prescriptions (Radnofsky and Walker2014), so it is important to track the individualpatients that are obtaining large amounts of nar-cotics, as well as the doctors and pharmacies that arefacilitating such diversion.

Our data set consists of three types of entities:patient, doctor, and pharmacy. It is equivalent to aheterogeneous graph with three types of nodes. Foreach pairwise relationship (patient-doctor, patient-pharmacy, doctor-pharmacy), we produce a bipartitegraph. Figure 2 visualizes doctor-pharmacy relation-

Articles

36 AI MAGAZINE

ships in a real-world health-care data set. Red nodesare doctors, and blue nodes are pharmacies. To avoidovercrowding the graph, we only visualize the top3000 nodes and the top 5000 edges in terms of theirnarcotics amount. We use Fruchterman-Reingold, aphysics-based layout, to reveal clusters of doctors andpharmacies who are connected together by heavynarcotics transactions. The graph exhibits clear pat-terns. For instance, it has long been suspected byfraud analysts that doctors with questionable nar-cotics prescription practices gravitate toward phar-macies bad at gatekeeping. In the graph, we clearlysee this pattern in the provider clustering. While thesystem computes and displays the graph almostinstantaneously, it would take an analyst many hoursto perform this kind of analysis manually.

ApproachTo automate detection of suspicious entities, we havedesigned a set of features, associated with aggregatedstatistics in the bipartite graphs. Given a node n andits 1-hop neighborhood N, we have degree: |N|, thenumber of nodes in the neighborhood; weight: theaggregated total number or total amount of claimsthat a node is associated with; and entropy ratio: howevenly the node associates with entities in its neigh-borhood, in terms of total number of claims or totalamount. Mathematically

where pk is the percentage of node n’s business withneighbor k out of its total business. The summationterm is the empirical entropy, measuring the disper-sion of n’s business among its neighborhood N. Theentropy is further divided by log(|N|) to normalize tothe range [0, 1]. If n evenly distributes its businessamong N, the entropy ratio is 1. If in contrast, n doesmost of its business with one neighbor, the disper-sion is very skewed, resulting in an entropy ratioclose to 0.

Figure 3 lists the different anomalies that we lookfor in the relation graph. The anomalies fall intothree categories: individual-level anomalies (labeledI), anomalies at the relationship (edge) level (labeledR), and anomalies with unexplainable medicalbehavior (labeled B). They are shown in variousshades of gray or (if in color) red, green, and bluefonts respectively. Individual-based anomalies ofinterest include: (I1) who are the heavy consumers ofnarcotics, and where they get drugs from; (I2) whichdoctors prescribe a lot of narcotics and to whom; (I3)which pharmacies sell a lot of narcotics and towhom. These questions are easy to answer based ondegree and weight features.

Anomalous relationships may include unusuallyfocused relationship such as (R1) where a pharmacy’snarcotics sales come from an unusually small number

ERn =1

log(|N |)pk

k N

log1pk

,

of patients and prescribing doctors; (R2) a doctordirects heavy narcotics sales to several pharmacies;and (R3) a doctor prescribes narcotics to only a fewpatients. High concentration between nodes can beinterpreted as potential collusion. The entropy ratiofeature can be used here.

A consequence of this analysis is the ability quick-ly to detect fraudulent characteristics that are ofinterest to our users. For example, our users com-monly look for (R4) “shopping patients,” that is, apatient visits a large number of doctors in order toget narcotics prescriptions. By sifting through mil-lions of beneficiaries, our algorithm can save analystshours of manual search time.

Behavioral anomalies are those that are not justi-fied by medical practice. These include (B1) if apatient consumes nothing but narcotics; and (B2)whether a patient-doctor relationship is focused onnarcotics alone. In order to quantify these metrics,we also incorporate the patients’ and doctors’ claimsoutside of narcotics, and find the percentage of nar-cotics by dollar amount and number of total claims.

Anonymized Cases Under InvestigationOur data set contains medical and pharmacy claimsfrom a state Medicaid program. It consists of rough-ly 64 million claim lines from 5.2 million patients,

Articles

SUMMER 2016 37

Figure 2. Bipartite Graph.

This graph visualizes the doctor (red/ light gray) – pharmacy (blue /dark gray ) relationship regarding narcotics prescriptions and sales.

PharmacyDoctor

more than 52,000 doctors, and nearly 9,000 phar-macies. We focus on Schedule II narcotics defined bythe U.S. Controlled Substances Act.4 Schedule IIdrugs are from drug classes such as opiates, stimu-lants, and depressants, which are all highly abusable.Examples include morphine, oxycodone, and fen-tanyl. Within the data set, our graph-analysis tech-niques have identified numerous suspicious activi-ties. All findings are currently being investigated bythe state’s program integrity analysts. Here we give afew examples.

Patient P36641 is the top narcotics consumer in2013, totaling an amount exceeding $400,000. He orshe gets fentanyl prescriptions entirely from DoctorD25542. Doctor D25542 is also the top prescribingdoctor for narcotics. He/she is currently under activeinvestigation by Medicaid’s Program Integrity Office.The same analysis on 2012 data points to a top pre-scribing doctor who is now convicted.

Patient P96274 visits 26 different doctors for pre-scriptions of methadone, hydromorphone, and fen-tanyl. The total is less than $10,000, but street valuecan be 50 times higher.

Pharmacy RX13230 has annual narcotics sales of$220,000, out of which $161,000 comes from a sin-gle doctor (Doctor D19848) for a single patientP90594. This unusually strong relationship is underinvestigation.

The detection of narcotics diversion can be extend-ed to other diversion problems in health care with ahigh resale value, such as durable medical equipmentand diabetes supplies. The same anomaly-detectiontechniques, described here, are applicable in thesedomains.

Temporal and Geospatial ReasoningInteresting insights can be obtained by exploring thedynamic property of a health-care graph. We ana-lyzed the graph’s temporal characteristics to find sev-eral types of anomalies. These anomalies include sinkvertices, source vertices, and heavy links. Sink ver-tices represent providers who attract patients fromother providers at unusually high rates. Source ver-tices are providers who can’t keep their patients.Heavy links are graph edges where unusually strong

Articles

38 AI MAGAZINE

Figure 3. Anomalies in a Narcotics Relation Graph.

I1. Heavy consumerof narcotics – wherethey get drugs from

R3. Doctors prescribing narcoticsto a very small percentage of patients

R4. Patient going to multiple doctorsto get narcotics (the shopping patient)

B2. Patient/doctor relationshipfocused on narcotics

B1. Are there patientswho consume nothingbut narcotics?

R2. Unusually high volumeedge between doctor andpharmacy

R1. Pharmacies with mostnarcotics revenue from a smallpatient population

I2. Doctorsprescribing anunusually highamount

I3. Pharmacies withsignificatnt narcotics revenue Pharmacy

DoctorPatient

business relationships occur. In current practice,fraud analysts manually search for these types ofanomalous providers through SQL-like queries. Ourapproach automates the effort and aids investigatorsto identify these outliers systematically.

We analyze the temporal characteristics by repre-senting claims as a discrete time sequence ofproviders for each patient and computing transition-al probabilities using maximum likelihood estima-tion (Lee, Judge, and Zellner 1968) on empiricalobservation data. By comparing these transitionprobabilities to a baseline, we can identify source,sink, and heavy links.

Anonymized Cases Under InvestigationOur analysis shows that most patients return to thesame pharmacy repeatedly and rarely deviate fromtheir pattern. More than 80 percent of prescriptionsare filled at the same pharmacy where the previousprescriptions are filled. By comparison to this base-line, two different types of source nodes are detectedby our algorithms. The first type of source nodestends to lose patients to another specific pharmacy(that is, a sink node). For example, our analysis iden-tified two pharmacies where 85 percent of thesource’s business is later transferred to the sink. Thisis particularly unusual given that these two pharma-cies are 500 miles apart. An example of this kind isworth further investigation to determine whether thebusiness relationship between the source and sinkrepresents truly fraudulent behavior. Interestingly,some pharmacies with prior fraud convictions haveshown up to be anomalous again for this analysis.The second type of source consists of pharmacieswho spread their patients to many different pharma-cies. These source pharmacies may not necessarily beinvolved in FWA activities, but could be losing cus-tomers due to poor quality of service. Neverthelesssuch abnormal patterns are worth investigators’attention.

Geospatial AnalysisGeospatial data are another useful source of informa-tion for anomaly detection. We assume that mostpatients visit physicians and pharmacies in theirlocal cities. This is especially true for the Medicaidpopulation since Medicaid is designed to coverpatients that have economic constraints or are phys-ically immobile. Occasionally there are many benignreasons why patients might visit providers far fromhome, for example, (1) sickness or injury during trav-el, and (2) visiting specialists like a surgical oncologistfor special treatment. We focus on outlier detectionmethods using aggregated statistics as features tohelp remove the effect of these rare events.

We compute the geographical distance betweenthe physician-pharmacy pair and derive an empiricalcumulative distribution function (cdf) (Mason 1982).Typically the empirical cdf increases sharply over dis-

tance. For example, a pharmacy’s or physician’s busi-ness relationships are 50 percent within a 10-mileradius, 80 percent within a 20-mile radius, 90 percentwithin a 30-mile radius, and so on. The dashed linesin figure 4 show a set of cdfs at different percentiles.We apply DBSCAN (Ester et al. 1996), a density-basedclustering algorithm, to the empirical distributions todefine the baseline. Cdfs that are similar to each oth-er are grouped together, while cdfs that deviate dras-tically from the norm are identified as anomalies.

Anonymized Cases Under InvestigationThe thick black line in figure 4 shows an anomalouscdf of a pharmacy, where 42 percent of its businesscomes from a physician more than 400 miles away.In addition to the long distances traveled by visitingpatients, the fact that all long-distance prescriptionscome from this single physician is abnormal, whichcould be an interesting finding in its own right.

Discovering Latent Networks of Providers

Sharing Anomalous PracticesIn this section we discuss the discovery of heteroge-neous provider communities that share anomalousbusiness practices. In particular, we consider extract-ing communities of prescribing providers that areparticipating in anomalous drug sales. Within such acommunity, each individual provider’s specialty willdetermine the kinds and quantity of the prescrip-tions the provider writes. A cardiologist’s prescrip-tions will be composed of a high proportion of heart-disease-related medications whereas an oncologistwill tend to prescribe a high proportion ofchemotherapy drugs. We aim to simultaneously dis-cover provider types while detecting when the pre-scription behaviors of heterogeneous provider com-munities are anomalous. For instance, a hypotheticalcardiologist and oncologist may be interacting witha pharmacy to sell narcotics to addicted patients.While the majority of their individual prescriptionsales are consistent with their types, composed ofheart-disease and chemotherapy drugs, respectively,the narcotics sales represent a shared deviation fromthose types. We call these communities out-of-speci-fication. In order to find these communities we needa concrete definition of a provider’s type and a meansof exploiting this type definition to find anomalouscommunities in our graph. In this graph twoproviders are connected if one of them has sent atleast one prescription to the other. Edges are labeledby a vector of features computed as a function of theprescription events. Unfortunately, we cannotobserve a provider’s type, nor the communities towhich the provider belongs. We develop a proba-bilistic algorithm that discovers out-of-spec commu-nities, ranks them and their members, and outputs a

Articles

SUMMER 2016 39

description of the community’s unusual behaviors.Our model of out-of-spec communities represents

the types of individual providers, the communityeach provider belongs to, the global prescriptionbehaviors generated by these types, and communityprescription behaviors generated by communitymembers. To do this we build upon the Latent Dirich-let Allocation model (Blei, Ng, and Jordan 2003) byintroducing the concept of communities, and com-munity prescription behaviors. Each provider in thegraph is associated with both a type and a communi-ty. Provider types are associated with distributionsover the global prescription behaviors. These distri-butions provide a basis for discovering anomalouscommunities. Each provider is also associated with acommunity; a community is a connected componentof the provider graph. Together a provider’s type andcommunity impose a distribution over the prescrip-tion behaviors. The set of global prescription behav-iors and the community prescription behaviors aredistributions over the edge features. Conditioned ona provider’s type and community, the data observedin the edges is assumed to be generated from a mix-ture of global behaviors (representing what is typical)and community behaviors (representing what is pecu-

liar to the community). We implemented an infer-ence algorithm that infers each part of this represen-tation. Given the data we discover the set of providertypes, each provider’s community memberships, theset of global behaviors, and the community prescrip-tion behaviors. Intuitively, communities with mem-bers whose adjacent edges are generated from thecommunity behavior distributions are consideredanomalous. According to this intuition we developedan algorithm that ranks the discovered communitiesand the community members.

Estimating the latent variables in the graph is acomputationally demanding task. There are a lot ofvariables here: provider type, community member-ships, prescribing behaviors of individual providers,and communities. We propose a simple agglomera-tive clustering procedure that seeks to iterativelyimprove the joint likelihood score by merging adja-cent communities. To accomplish this we define amerge score that compares the log-likelihood of thecurrent communities against the log-likelihood ofthe merged community. Using this score we designeda simple greedy agglomeration procedure that isguaranteed to find a partitioning of the graph. Theresult is an efficient search for an approximate solu-

Articles

40 AI MAGAZINE

Figure 4. Example of Geospatial Anomaly.Fi 4 E l f G tit l A l

1 percentile

10 percentile

25 percentile

50 percentile

75 percentile

90 percentile

99 percentile

PHARMACY X (outlier)

400.0 500.0100.0 200.0 300.00.00.0

0.2

0.4

Cum

ulat

ive

Dis

trib

utio

n

Distance (miles)

0.6

0.8

1.0

tion. As output we receive the collection of commu-nities and their corresponding latent variable esti-mates for presentation to analysts.

We have integrated our model with two tools thatour users can employ to investigate the discoveredcommunities. The first is the network explorer (NE),which provides a high-level view of the communityset. We will describe NE features in more detail later inthe article. The second tool is the Group-in-a-Boxvisualization (Rodrigues et al. 2011) to provide visualcues of why communities and their members areanomalous. Figure 5 shows an example view of anom-alous latent communities. Communities with lowanomaly scores are removed in order to present a lesscluttered visualization. The high-level view clearlyhighlights community members that stand out withrespect to their peers. Larger nodes (red dots) repre-sent more anomalous community members. Ourusers can garner further details on demand by clickinga node to zoom into a finer graph structure.

Anonymized Cases Under InvestigationGiven a network of approximately 74,000 providerswith more than 900,000 prescription relationships,our algorithm discovers 900 communities of varyingsizes. In order to evaluate the quality of the discov-ered communities we presented our partners with alist of 40 providers. Our algorithm ranks both thecommunities and community members. To generateour list of providers we took the top five communitymembers from the top 11 communities. This is asmall set of providers, but our evaluation is limitedby the very high cost of evaluating examples (ourpartners estimate up to two hours of time are neededper provider). We asked our partners to identifywhether a member of the list represented FWA andto provide feedback on the providers. In order to pro-duce these labels our partners could make use of ahistorical case database that stores information aboutpast analysis. Of the set we presented to our partners,they identified nine individuals as being representa-tive of FWA. Seven of these providers were previous-ly identified, through weeks of painstaking search, bypartners. Importantly, two of the discoveredproviders became new cases. This was interesting giv-en that our analysis was performed on historical data.Our algorithms are able to analyze the data in hoursand provide illustrating evidence so that the analystscan build a case with ease in a day’s worth of effort.

Discovering Anomalous Structure in the Graph

In this section, we report our work in progress on anonparametric approach to discovering anomalouscommunities in the medical network. We assumethat we are given an arbitrary input graph G with

nodes being entities such as providers, hospitals,pharmacies, and patients, and the edge attributesreflecting the strength of interaction between thenodes. For concreteness, in this article we considerthe specific case of referral networks where the graphG is composed of provider nodes, and the linksbetween nodes a and b represent the total number ofreferrals between providers a and b. Given this inputgraph G, we are interested in identifying subsets ofcommunities that are anomalous. We do this in athree stage process: (1) identification of communi-ties in G; (2) extraction of features characterizingthese communities; and (3) identification of anom-alous communities using these multivariate featurerepresentations of these communities. We discusseach of these steps in detail in the sequel.

Community ExtractionAs a first step, we extract tight-knit communities inthe graph G. Community detection in a graph is awidely studied problem in the network data-miningliterature. However, most of the popular methodssuch as graph partitioning, hierarchical clustering,and spectral clustering are concerned with partition-ing the graph into disjoint sets of tight-knit nodes(Fortunato 2010). These partitioning methods how-ever are not a good fit in our particular context ofmedical networks for the reason that the entiregraph G need not be partitionable into tight-knitcommunities; rather we expect a few pockets oftight-knit communities interspersed in the graph.

As a consequence of this observation, we devel-oped an agglomeration-based partitioning schemethat only identifies the small pockets of tight-knitcommunities as opposed to completely partitioningthe set of nodes into disjoint subsets. The proposedagglomeration scheme works by building communi-ties one node at a time in a greedy fashion, andadding nodes to the communities while ensuringthat the communities remain tightly knit. Wedenote the set of communities extracted from G byC_

= {C1, C2, . . ., Ck}.After extracting the set of communities C

_ through

the proposed agglomeration scheme, we check to seewhether any of the extracted communities areanomalous. We do this in two steps. As a first step,we check the case where the very existence of com-munities is anomalous. To check this case, we com-pute the ratio of the total number of nodes in C

_rel-

ative to the total number of nodes in G.Conceptually this ratio is similar to the well-knowngraph modularity metric proposed by Newman(2006), except that this ratio is defined based onnodes, and the graph-modularity metric is definedon edges. On one hand, if the ratio is very small, itindicates that G is a network that is largely commu-nity free, and we therefore declare that all the dis-covered communities in C

_are anomalous. On the

other hand, if the ratio of the number of nodes in C_

Articles

SUMMER 2016 41

relative to G is moderate to large, then we concludethat the presence of a community in G does not indi-cate that the community is anomalous.

Feature ExtractionIn the event of the latter scenario being true, weextract several features that are of interest in order tocharacterize each of these communities, and subse-quently we look for communities that are anomalouswith respect to the extracted feature sets. In this arti-cle, we consider the following sets of features to char-acterize any given community Ci in the referral net-work: community size, community density, averagedollar amount, and average anomaly score. Com-munity size is the number of nodes in Ci. Communi-ty density is the ratio of the total number of edges inCi relative to the number of nodes. Average dollaramount is the ratio of the total dollar amount asso-ciated with the referral edges in Ci relative to thenumber of nodes. For the average anomaly score,independent of network analytics, we computeanomaly scores for all providers based on marginalstatistics and compute the average anomaly score ofa community Ci as the average of the anomaly scoresof all providers in the community in order to detectwhether a community has an abnormal concentra-tion of anomalous providers.

Anomaly DetectionIn our final step, we extract anomalous communitiesusing these features by feeding the features throughan off-the-shelf anomaly-detection method for mul-tivariate data. In this article, we use the iForest anom-aly-detection algorithm (Liu, Ting, and Zhou 2008),which is currently the state of the art. The iForestmethod detects anomalies based on the difficulty ofisolating a point from the rest of the points using ran-domly generated classification trees. The intuition isthat an outlier point is far easier to isolate than nor-mal points.

Anonymized Cases Under Investigation.We applied the described procedure to a referral net-work with about 60,000 providers. On running ouragglomeration-based partitioning algorithm, we dis-covered a total of 2432 communities. These 2432communities accounted for about 40,000 providers,or about 66 percent of the total nodes in the network.Thus, the presence of communities in this network isnot anomalous.

Subsequently, we extract community size, density,average dollar amount, and average anomaly score asfeatures for anomaly detection. On running iForest,we discovered a total of 34 anomalous communities.Five communities were flagged because of their largesize. Each of these communities had in excess of 200providers or communities, while a majority of thecommunities had an average of about 10 providers.Ten other communities were flagged for high densi-ty, another 12 were flagged for high dollar amount,and the remaining 7 were flagged for high anomalyscores. An interesting observation was that some ofthe communities were anomalous with respect tomore than one feature. For instance, one particularcommunity, which had about 400 providers, also hadan abnormally high density.

Visualization: The Network ExplorerNetwork visualization is one of the most difficultchallenges of information visualization. Many tech-niques have been proposed for small networks (withhundreds of nodes or less), but they are ill-suited foranalyzing our medical network of tens of thousandsof providers. To address the challenge, we have devel-oped a new visualization tool, the network explorer,to support the FWA detection task. Figure 6 shows thenetwork explorer’s data overview interface. Theoverview enables users to get an eagle-eye picture ofthe network and get an overall idea of how manynodes (providers, patients, or pharmacies) are suspi-cious, whether and how the nodes form clusters, andtheir respective cluster sizes. The network exploreruses a rank-by-relevance framework to visualize asubset of nodes (for example, 10,000) to control thevisualization complexity and focus visualizationresources only on noteworthy nodes. Highly suspi-

Articles

42 AI MAGAZINE

Figure 5. Anomalous Communities Discovered by the Analysis.

cious nodes identified by the anomaly-detectionschemes described in the earlier sections are high-lighted. Clustering provides visual structure that canbe easily interpreted by the users.

In the network explorer, users navigate the networkthrough filtering and selection. The right-hand panelin figure 6 demonstrates the filtering mechanism.One can filter the network by edge properties (such asthe number of referrals, as shown by the first filteringbar) or by node properties (such as the anomaly scorebased on how rare a procedure is performed, aka pro-cedure fraction anomaly score, as shown in the sec-ond filtering bar). This filtering mechanism allowsusers to query the network directly.

Given an identified suspect node, the user mayneed to explore the suspect’s network connections.In fact the ability to do local network exploration isthe second most requested requirement from ourusers (next only to the overview). For instance, one ofour users was interested in understanding whichpharmacies were driving the business from a specificprescriber, and identifying which other prescriberswere heavy customers of said pharmacy. To meet thisrequirement, the network explorer includes an ego-centric mode that lets users select one or more targetnodes and interactively explore their ego-net, that is,the k-hop neighborhood for small k. All of the fea-tures available in the overview mode also work in theegocentric mode, including filtering, dynamic group-in-a-box, and rank-by-relevance. Figure 7 demon-strates the progressive drill down and visualizationwith increasingly finer granularity. In this example,the user has clicked a node to jump into the graycluster on the top-left corner of figure 7a. By doingthis, only the 83 nodes in the cluster are shown inthe screen (figure 7b). Then the user reruns the clus-tering algorithm to discover the finer subclusterstructure shown in figure 7c and finally zooms into asubcluster with only 27 nodes (figure 7d). The nodenavigator bar on the left of figure 7 shows the usershas jumped into a cluster two times. This zooming-inability, as well as zooming out by clicking on theopen space or the node navigator bar, allows the userto navigate the network freely and effectively.

Graph Analytics in Real-World FWA Detection

We have deployed our analytics system to supportseveral business applications to detect fraud, waste,abuse, and other kinds of inappropriate billing. Theseapplications include provider review, cost contain-ment, recovery services, and prepay detection.

The goal of provider review is to find providers(doctors, hospitals, clinics, and others) who arebilling inappropriately and who will be the mostvaluable to audit, judging by the amount billed, thedegree to which the billing is inappropriate, and oth-er factors such as the extent to which patient health

is endangered. Analysis aims to maximize a valuefunction over providers or sets of providers.

The goal of cost containment is to find a proposedchange to the current claim payment rules that islikely to result in increased efficiency, decreased cost,or improved health-care outcomes. These opportu-nities focus less on individual providers, patients, orclaims and instead focus on a set of these. Hereanalysis aims to find billing patterns that are com-mon and expensive but inappropriate.

The goal of recovery services is to find individualclaims where more money was paid than shouldhave been and then to contact the associatedproviders and get the money back. For example, arecovery services call center may ask a provider torefund money if the provider was accidentally paidtwice for the same service, or if another insurancecompany should have been billed first. Analytics forrecovery services focuses on overbilling that can beproven easily and then tries to find as manyinstances as possible.

The goal of prepay detection is to identify inap-propriate claims before the provider is paid for thoseclaims. For any given claim, a prepay algorithmdetermines whether the claim should be rejected,sent to a human analyst for further study, orprocessed normally.

We work with teams that provide services organ-ized around the business applications mentionedabove. In that work, we use our deployed system toprovide analytics reports and interactive softwarethat can be shared with analysts performing providerreviews, cost containment, and recovery. Our part-ner teams, through their interaction with thedeployed system, give us feedback on algorithms,reports, and software, allowing us to improve themiteratively. In addition, improvements made to sup-port one team often support others. Our analyticshave already been used to find many overpaymentsincluding provider review and cost containment cas-es with a potential value of several million dollarsand recoverable claims with a potential value ofroughly a million dollars per month.

Our graph analytics support three of the fourkinds of service. For provider review, one way to findsuspicious providers is to look at the graph of rela-tionships between providers, such as patient referralsand shared patients. If providers are colluding todefraud the system, that will show up in this graph.Likewise, providers and patients may collude to billinsurance payers for drugs or supplies and then sellthem on the street. In cost containment, a providerbilling too much for one patient will often bill toomuch for other patients as well. Patterns in theprovider-patient network, then, can uncover sys-temic overbilling that can be addressed by a rulechange. In prepay detection, when making a deci-sion about a new claim, the algorithm can look atpatient-provider, provider-provider, and patient-

Articles

SUMMER 2016 43

patient relationships together with informationabout particular providers, patients, and claim fea-tures that have been associated with overbilling inthe past to recommend human review of someclaims.

Discussion: Deployment and Evaluation

We have applied modular programming principles todesign our anomaly-detection components as inde-pendent and interchangeable algorithmic pieces thatcan be composed or rewired to operate on diversedata sets. For instance, the same set of graph featureextraction algorithms described in the Narcotics Net-work section can be applied to data sets regardingdiabetic supply or durable medical equipment toidentify potential perpetrators. The modularity and

reusability have significantly reduced deploymentcost. The remaining major deployment workload ison data ingestion and schema adaptation. Dataingestion needs to take into account the multipledata representations coexisting in different parts ofthe data. For instance, providers are often identifiedusing National Provider Identifier (NPI) numbers, butpharmacies, which are also one type of provider, aresometimes identified using their NABP (NationalAssociation of Board of Pharmacy) numbers. Like-wise, patient identity may be represented by severalinsurance plan/program numbers. U.S. medical prac-tice is transitioning its diagnosis code system fromICD-9 to ICD-10, and hence almost all data sets see amixture of both. We customize ETL (Extract/Trans-fer/Load) code to unify the differences and converteach data set’s native data representation to the data-set-independent representation that our algorithmmodules expect. Depending on data complexity, the

Articles

44 AI MAGAZINE

Figure 6. Network Explorer Main Interface with 10,000 Nodes.

data ingestion and schema adaptation effort mayamount to days of work. Amortized over the lifetimeof the XPIV, the deployment cost is fairly low.

As we create our algorithms, we evaluate themagainst three levels of benefit: (1) productivity bene-fits, (2) human-level quality on results with a reduceddetection time, and (3) greater-than-human-levelquality on results. Our evaluations run from infor-mal, such as user testimonials, to formal, such as cal-culation of precision in finding overpayments.

For example, one Xerox partner wrote, “Usingthese technologies will improve the selection of audittargets, which has a direct impact to revenue on thesecontingency-based contracts.” Another Xerox part-ner wrote, “Interesting flag. . . . So it has a high posi-tive hit rate at first pass,” and also wrote, “In the first5 minutes I identified a possible referral . . .” and also,“Without [this tool], it would have been very difficultand quite time consuming to do this research.”

On the more formal side, we have been fortunateto have analysts who are willing to go through largeresults sets, including thousands of flagged health-care claims, to see which are or are not recoverable.For example, after several iterations of improvingduplicate detection, we were able to get 100 percentprecision on a first result set based on criteria set bythe analysts. As another example, using relationshipsin a narcotics graph, we produced a list of providerswith high anomaly scores and made them availableto a group of analysts for review. Of the 39 providersreviewed so far, 8 were already under review and 1more was written up for review, for 9/39 = 23 percentsuspicious cases. Of the remaining cases, 11 were inprovider categories that are out of scope for review atthe time. Eight more were of low materiality basedon predetermined thresholds defined by the payer.

Such providers can easily be filtered out of futurereports. Removing these 19 providers from our cur-rent set would give us an updated ratio of 9/20 or 45percent. Considering a list of recommendedproviders still takes time and effort for our analysts,but it appears to be a valuable new source of candi-date providers and reduces effort compared to man-ually constructed queries and reports. As these evalu-ations indicate, our tools and algorithms have beenable to improve user productivity and allow users toproduce results that were difficult or time consumingto produce previously.

These statements speak to the impact of the systemfrom the point of view of analysts. As our programcontinues to develop we plan to augment this analy-sis to include additional measurements of systemquality. For instance, a crucial measurement in frauddetection is the rate of case identification for indi-vidual analysts. An ideal system increases this rate.

Our initial evaluations, though preliminary, sug-gest that our tool successfully improves work flows.Our future effort will determine the magnitude ofthis improvement. In addition, in coordination withour business partners we continue to construct larg-er sets of ground truth data that are crucial for pre-liminary evaluation of new analytics. We expect, incoming years, to establish empirically the robustnessof our deployed system.

ConclusionThis article presents our work on developing graph-analysis techniques and applying them to real-worldhealth-care data sets to look for fraud, waste, andabuse activities. We represent the health-care rela-tionship using heterogeneous graphs and identifying

Articles

SUMMER 2016 45

Figure 7. Users Can Jump into Clusters to Focus Their Exploration into Specific Parts of the Network.

visible

coun

t

cluste

r

visible

coun

t

cluste

r

visible

coun

t

cluste

r

visible

coun

t

cluste

r

visible

coun

t

cluste

r

visible

coun

t

cluste

r

visible

coun

t

cluste

r

visible

coun

t

cluste

r

a b dc760 760 83 760 83 760 83 27

Institute of Medicine. 2012. Best Care at Lower Cost: Trans-formation of Health System Needed to Improve Care and ReduceCosts. Report, September 6. Washington, DC: NationalAcademies of Sciences, Engineering, Medicine. iom.nation-alacademies.org/Reports/2012/Best-Care-at-Lower-Cost-The-Path-to-Continuously-Learning-Health-Care-in-Ameri-ca/Press-Release.aspx#sthash.fvodsvbX.dpuf

Leap, T. L. 2011. Phantom Billing, Fake Prescriptions, and theHigh Cost of Medicine. Ithaca, NY: Cornell University Press.

Lee, T. C.; Judge, G. G.; and Zellner, A. 1968. MaximumLikelihood and Bayesian Estimation of Transition Probabil-ities. Journal of the American Statistical Association 63 (324):1162–1179. dx.doi.org/10.1080/01621459.1968.10480918

Liu, F. T.; Ting, K. M.; and Zhou, Z.-H. 2008. Isolation For-est. In Proceedings of the Eighth IEEE International Conferenceon Data Mining, ICDM’08, 413–422. Piscataway, NJ: Institutefor Electrical and Electronics Engineers. dx.doi.org/10.1109/icdm.2008.17

Manyika, J.; Chui, M.; Brown, B.; Bughin, J.; Dobbs, R.; Rox-burgh, C.; Byers, A. H. 2011. Big Data: The Next Frontier forInnovation, Competition, and Productivity. McKinsey GlobalInstitute Report (May). New York: McKinsey & Company.

Mason, D. M. 1982. Some Characterizations of Almost SureBounds for Weighted Multidimensional Empirical Distribu-tions and a Glivenko-Cantelli Theorem for Sample Quan-tiles. Zeitschrift für Wahrscheinlichkeitstheorie und VerwandteGebiete 59(4): 505–513. dx.doi.org/10.1007/BF00532806

Newman, M. 2006. Modularity and Community Structurein Networks. In Proceedings of the National Academy of Sci-ences USA 103(23): 8577–8582. dx.doi.org/10.1073/pnas.0601602103

Radnofsky, L., and Walker, J. 2014. DEA Restricts NarcoticPain Drug Prescriptions. Wall Street Journal: 22 August.

Rodrigues, E. M.; Milic-Frayling, N.; Smith, M.; Shneider-man, B.; and Hansen, D. 2011. Group-in-a-Box Layout forMultifaceted Analysis of Communities. In Proceedings of the2011 IEEE Third International Conference on Privacy, Security,Risk and Trust (PASSAT) and Social Computing (SocialCom),354–361. Piscataway, NJ: Institute for Electrical and Elec-tronics Engineers.

Juan Liu is the lead research scientist at Medallia. She waspreviously at Palo Alto Research Center, which is where theresearch for this article was conducted.

Eric Bier is principal scientist at Palo Alto Research Center.

Aaron Wilson is a research scientist and manager at PaloAlto Research Center.

John Alexis Guerra-Gomez is a research scientist at YahooLabs.

Tomonori Honda is a senior data scientist atInflection.com.

Kumar Sricharan is a research scientist at Palo AltoResearch Center .

Leilani Gilpin is a Ph.D. student at the Massachusetts Insti-tute of Technology.

Dan Davies is a research engineer at Palo Alto ResearchCenter.

anomalous individuals, relationships, and communi-ties by analyzing the local and global characteristicsof the graphs. Our work has identified investigationtargets totaling millions of dollars of potential recov-ery for our collaborators at Xerox Services.

Our future work will take several forms. First, weplan to extend our graph-analysis techniques to scanincoming claim streams fast enough to intercept sus-picious claims before they are paid. This early detec-tion requires the graph-analysis algorithms to beoptimized for memory and computation, runningquickly on large graphs. In addition, we plan to addadditional feedback loops to our system, so thatactions taken by users of our technologies becomeinput to the algorithms. This will enable a rigorousperformance evaluation of detection precision. At thesame time, the algorithms will learn from the suspi-cious activities that users explore and mark, and theresults of audits, investigations, and recoveries. Final-ly, we will allow users to configure the analytics sothat it is easy to tune them to the needs of specialistsand repeat successful analyses on new data sets.

AcknowledgmentsWe acknowledge technical contributions from ourcolleagues John Hanley, Alan Bell, Sureyya Tarkan,Alex Brito, Ming Yang, and Nick Briggs.

Notes1. See NHCAA. 2011: The Challenge of Health Care Fraud:Consumer Alerts: The Impact of Health Care Fraud on You.Washington, DC: The National Health Care Anti-FraudAssociation (www.nhcaa.org/resources/health-care-anti-fraud-resources/the-challenge-of-health-care-fraud.aspx).

2. Infographics on the 2013 video series Rehab Racket pro-duced by CIR and CNN are available from revealnews.org.

3. The Health Insurance Portability and Accountability Actof 1996, Public Law 104-191, is available at www.cms.gov/Regulations-and-Guidance/HIPAA-Administrative-Simplifi-cation/HIPAAGenInfo/Downloads/HIPAALaw.pdf.

4. Title 21 CFR 1308.12. U.S. Department of Justice, DrugEnforcement Administration, available at www.ecfr.gov.

ReferencesAggarwal, C. C. 2012. Outlier Analysis. Berlin: Springer.

Blei, D. M.; Ng, A. Y.; and Jordan, M. I. 2003. Latent Dirich-let Allocation. Journal of Machine Learning Research 3(3): 993–1022.

Busch, R. S. 2012. Healthcare Fraud: Auditing and DetectionGuide. New York: Wiley.

Epstein, R. 1989. Drug Wars in the United States. BritishMedical Journal 299(6710): 1275–1276. dx.doi.org/10.1136/bmj.299.6710.1275

Ester, M.; Kriegel, H.-P.; Sander, J.; and Xu, X. 1996. A Den-sity-Based Algorithm for Discovering Clusters in Large Spa-tial Databases with Noise. In Proceedings of the Second Inter-national Conference on Knowledge Discovery and Data Mining,226–231. Menlo Park, CA: AAAI Press.

Fortunato, S. 2010. Community Detection in Graphs.Physics Reports 486(3): 75–174. dx.doi.org/10.1016/j.phys-rep.2009.11.002

Articles

46 AI MAGAZINE

Documents

Graph Analysis for Detecting Fraud, Waste, and Abuse in ...xqzhu/courses/kmelin/graph...Graph Analysis for Detecting Fraud, Waste, and Abuse in Health-Care Data Juan Liu, Eric Bier,