Data Mining Application in U.S Crop Insurance Program Alexis Garcia ISQS 7342




U.S Crop Insurance Primer

• Crop Insurance is a vital component of many farm operations throughout the nation

• Farming is an inherently risky enterprise

• Producers rely on insurance policies to protect their investments in land, livestock, seed, and crops


• In 2004, the program provided producers with over $47B in liability protection across about 1.2 million policies, at a cost of $3.6B

• The program incurred a loss of $160M as a result of waste, fraud, and abuse


Application of Data Mining

• The Agricultural Risk Protection Act of 2000 (ARPA) funded crop insurance data mining in an effort to detect abuse and schemes for filing fraudulent loss claims

• The CAE-USDA RMA database incorporates weather, soil, and other agronomically relevant data that aid in formulating farm policy strategies


• The database contains more than 2 terabytes of information and links data across time to allow multi-year comparisons

• The CAE-USDA RMA partnership has produced more than 200 data mining research products, among them the SPOTCHECK Program


Spotcheck Program

• Designed to identify suspicious patterns indicating possible program abuse (fraudulent insurance claims)

• The data mining algorithms start from leads such as anecdotes from the field or the experience of investigators, producers, agents, or adjusters with schemes to exploit the program

• These schemes are analyzed to determine whether they occur in the national data, where and to what extent, and whether the scheme is structured and results in personal benefit



Results of the US Crop Insurance Data Mining Program

• Decrease in the number of fraudulent crop insurance claims

• Decrease in the amount of loss due to fraud, waste, and abuse

• RMA caught $300M in fraudulent claims between 2001 and 2004


Popular Articles in U.S Crop Insurance with Data Mining Application

• Using Data Mining to Detect Anomalous Producer Behavior: An Analysis of Soybean Production and the Federal Crop Insurance Program (Olson, Little, Lovell, 2003)

• Collusion in The U.S. Crop Insurance Program: Applied Data Mining (Little, Johnston, Lovell, Rejesus, Seed, 2003)


Using Data Mining to Detect Anomalous Producer Behavior

• Develop a data mining algorithm and apply it to identify anomalous producers and counties within Land Resource Regions (LRRs), based on the percentage of acres harvested

• LRRs are used to spatially group insured producers into agronomically homogeneous groups, accounting for natural resource availability


• Dependent variable: percentage of acres harvested

• Other variables include state and county code, LRR, reinsurance year, crop code, practice code, acres planted, acres harvested, liability, indemnity, producer risk premium

• The data cover 625,031 unique producers across 2.58 million observations from reinsurance years 1994 to 2001
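
As a rough illustration of the dependent variable, the sketch below derives a percentage of acres harvested from the acres planted and acres harvested fields. The slides do not give the exact formula, so the ratio used here (harvested divided by planted, times 100) is an assumption, and the records and the function name `pct_acres_harvested` are hypothetical.

```python
def pct_acres_harvested(acres_planted, acres_harvested):
    """Acres harvested as a percentage of acres planted (assumed definition).

    Returns None when nothing was planted, since the ratio is undefined.
    """
    if acres_planted <= 0:
        return None
    return 100.0 * acres_harvested / acres_planted


# Hypothetical records: (producer id, reinsurance year, acres planted, acres harvested)
records = [
    ("P1", 2000, 200.0, 190.0),
    ("P2", 2000, 150.0, 30.0),
]

# One percentage per producer and reinsurance year.
percentages = {
    (producer, year): pct_acres_harvested(planted, harvested)
    for producer, year, planted, harvested in records
}
```

Under this definition, a producer who harvests far less of the planted acreage than comparable producers would stand out as a candidate for closer review.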


• An exceptionally low or high percentage of acres harvested could be an indicator of anomalous producer behavior

• The dependent variable was smoothed using a five-year moving average

• The percentage was then normalized to a z-score within each LRR
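
The smoothing and normalization steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a trailing window (early years average whatever history is available) and the population standard deviation, neither of which the slides specify. The LRR labels and producer ids are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def moving_average(values, window=5):
    """Trailing moving average; early entries use as many values as exist."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


def zscores_within_group(rows):
    """rows: iterable of (group, producer, value). Returns {producer: z-score},
    where each z-score is computed against the producer's own group."""
    by_group = defaultdict(list)
    for group, producer, value in rows:
        by_group[group].append((producer, value))

    z = {}
    for members in by_group.values():
        values = [v for _, v in members]
        mu, sigma = mean(values), pstdev(values)
        for producer, v in members:
            # A group with no spread carries no outlier signal.
            z[producer] = 0.0 if sigma == 0 else (v - mu) / sigma
    return z


# Hypothetical yearly percentages of acres harvested for one producer.
smoothed = moving_average([90, 80, 100, 70, 60, 50], window=5)

# Hypothetical smoothed percentages for producers in two LRRs.
z = zscores_within_group([
    ("LRR-A", "P1", 90),
    ("LRR-A", "P2", 70),
    ("LRR-A", "P3", 80),
    ("LRR-B", "P4", 50),
])
```

Normalizing within each LRR means a producer is compared only with producers facing similar natural resource conditions, so a low harvest percentage in a drought-prone region is not flagged merely for being below the national average.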


• After normalization, an outlier detection method was used to identify producers with anomalous behavior

• Producers were identified as anomalous if they were at or below the 5th percentile with p <= 0.01, or at the 1st percentile with p <= 0.01

• Normal producers were then profiled against anomalous producers
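
A flagging rule of this kind might look like the sketch below. The slides do not say how the p-value was obtained, so the code assumes a one-sided p-value from the standard normal lower tail, and it covers only the lower-tail case (at or below the 5th percentile with p <= 0.01). The function name `flag_anomalous` and the sample data are hypothetical.

```python
from statistics import NormalDist


def flag_anomalous(z_by_producer, pct_cutoff=5.0, p_cutoff=0.01):
    """Flag producers in the lower tail of one LRR's z-score distribution.

    A producer is flagged when its percentile rank is at or below pct_cutoff
    AND its assumed one-sided normal p-value is at or below p_cutoff.
    """
    all_z = list(z_by_producer.values())
    n = len(all_z)
    flagged = []
    for producer, z in z_by_producer.items():
        # Percentile rank: share of producers with a z-score at or below this one.
        rank = 100.0 * sum(1 for other in all_z if other <= z) / n
        # Assumption: p-value taken from the standard normal lower tail.
        p = NormalDist().cdf(z)
        if rank <= pct_cutoff and p <= p_cutoff:
            flagged.append(producer)
    return flagged


# Hypothetical LRR: 99 typical producers and one with a very low z-score.
z_scores = {f"P{i}": 0.0 for i in range(99)}
z_scores["X"] = -3.0
anomalous = flag_anomalous(z_scores)
```

Requiring both conditions keeps the rule from flagging a producer who merely ranks lowest in a tightly clustered group: the rank test asks whether the producer is extreme relative to peers, and the p-value test asks whether the deviation is large in absolute terms.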
