Advanced Data Analytics in Fraud Identification
February 23, 2015
Agenda
► Current challenges
► Data analytics defined
► What are clients saying?
► Technology
► Case Examples
► Questions
Current challenges err…opportunities
► Geopolitical
► Government contracts
► Conflict minerals
► Transparency
► Reputation
► Cyber
► Sanctions
► Litigation
► Labor relations
► Customer complaints
► Investigations
► Tax evasion
► Off shore
► HSE
► Fines
► Self certification
► Corruption
► Suppliers
► Bribes
► Politically Exposed Persons
United States cases: US pending investigations
► As of July 2014, 106 publicly disclosed investigations pending:
ABM Industries Incorporated; Accenture PLC; Agilent Technologies Inc.; Airbus Group; Alstom SA; Analogic Corporation; Anheuser-Busch InBev SA/NV; AstraZeneca PLC; Avon Products Inc.; Barclays PLC; Beam Inc.; BHP Billiton Ltd; Bio-Rad Laboratories Inc.; Blackstone Group LP; Bristol-Myers Squibb Company; Brookfield Asset Management Inc.; Bruker Corporation; BSG Resources Ltd; Central European Distribution Corporation; Chestnut Consulting Inc.; Cisco Systems Inc.; Citigroup Inc.; Credit Suisse Group AG; Cobalt International Energy Inc.; Comcast (NBCUniversal Inc.); Cubist Pharmaceuticals Inc. (Optimer Pharmaceuticals Inc.); Dialogic Inc.;
Delphi Automotive PLC; Deutsche Bank AG; Deutsche Post AG (DHL); DreamWorks Animation SKG Inc.; Dun & Bradstreet Corporation; Eli Lilly and Company; Embraer SA; Ericsson AB; Expro International Group PLC; FedEx Corporation; Freeport-McMoRan Copper & Gold Inc.; Fresenius Medical Care AG & Co KGaA; General Cable Corporation; GlaxoSmithKline PLC; Gold Fields Limited; Goldman Sachs Group Inc.; Goodyear Tire and Rubber Company; Grifols SA (Talecris Biotherapeutics Holdings Corp); Halliburton Company; Harris Corporation; Hyperdynamics Corporation; Image Sensing Systems Inc.; Ingersoll-Rand PLC; International Business Machines Corporation; Johnson Controls Inc.; JPMorgan Chase & Co; Juniper Networks;
Key Energy Services Inc.; Kimco Realty Corporation; KKR & Company LP; Las Vegas Sands Corp; Layne Christensen Company; Mead Johnson Nutrition Company; Merck & Co Inc.; Microsoft Corporation; Mondelēz International Inc. (formerly Kraft Foods Inc.); Morgan Stanley; Motorola Solutions Inc.; MTS Systems Corporation; National Geographic; NCR Corporation; Net 1 UEPS Technologies Inc.; News Corporation; Nordion Inc.; Novartis AG; Och-Ziff Capital Management Group LLC; Olympus Corp; Oracle Corporation; Orthofix International NV; Owens-Illinois Group Inc.; Panasonic Corporation; PTC Inc.; Park-Ohio Industries Inc.; Protective Products of America Inc.;
Qualcomm Incorporated; Quanta Services Inc.; Rolls Royce PLC; Sanofi SA; SBM Offshore NV; Sciclone Pharmaceuticals Inc.; Sensata Technologies Holding NV; Siemens AG; SL Industries Inc.; Smith & Wesson Holding Corporation; Société Générale SA; Sony Corporation; STR Holdings Inc.; Tata Communications Limited; Tesco Corporation; TeliaSonera AB; Teva Pharmaceutical Industries Limited; UBS AG; Universal Entertainment Corp; Universal Music Group (Vivendi); Viacom (Paramount Pictures); VimpelCom Ltd; Wal-mart Stores Inc.; Walt Disney Company; WS Atkins PLC (PBSJ Corp)
Forensic Data Analytics Defined
Forensic Data Analytics: Analytics Defined
Forensic Data Analytics (“FDA”) is a multidisciplinary service for the efficient, cost-effective identification of relevant information in large-scale client datasets. It covers a wide range of activity, including the analysis of data to obtain meaningful insights for investigative, legal, regulatory, anti-fraud or risk mitigation matters.
FDA combines extensive use of data, statistical and quantitative analysis, and explanatory and predictive models to identify issues and areas that warrant further review.
FDA outputs can give companies legally defensible, fact-based evidence to drive decisions and focus investigative effort where it matters, with the aim of achieving favorable outcomes.
FDA supports the Corporate Integrity & Compliance Framework
[Diagram: Forensic Data Analytics and the Corporate Integrity & Compliance Framework. Pillars: Protect, Detect, Respond. Labels: Reactive, Risks, Proactive.]
How is fraud detected?
Source: ACFE 2010 Report to the Nations On Occupational Fraud
48.5% by tip or accident
Forensic Data Analytics Maturity
Focus and capabilities (in order of maturity):
Forensic single version of the truth: “Gather the hay into a haystack”
1. Forensic data discovery & extraction
2. Data joining and filtering
Errant behaviour detection: “Find the needle”
3. Application of rules
4. Matching to external data (e.g. sanctions lists)
5. Unstructured text mining (e.g. keyword search)
6. Entity analytics
7. Statistical analysis & anomaly detection (structured & unstructured data)
Ongoing monitoring & protection: “Prevent the needle being lost in the first place”
8. Visualisation and drill-down
9. Case management and feedback loop
Higher detection rate; lower false positive rate
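As an illustration of capability 4 (matching to external data such as sanctions lists), the following is a minimal sketch rather than a production screening method. It assumes a simple fuzzy string comparison using Python's standard `difflib`; the vendor names, the watchlist entries, the `normalize` and `match_to_watchlist` helpers, and the 0.85 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase the name and replace punctuation with single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def match_to_watchlist(vendors, watchlist, threshold=0.85):
    """Return (vendor, listed_name, score) for pairs scoring at or above threshold."""
    hits = []
    for vendor in vendors:
        for listed in watchlist:
            score = SequenceMatcher(None, normalize(vendor), normalize(listed)).ratio()
            if score >= threshold:
                hits.append((vendor, listed, round(score, 2)))
    # Highest similarity first, so reviewers see the strongest matches on top.
    return sorted(hits, key=lambda h: h[2], reverse=True)


# Hypothetical data, for illustration only.
vendors = ["Acme Trading Co.", "Northwind Supplies", "ACME TRADING CO"]
watchlist = ["Acme Trading Co", "Globex Holdings"]
hits = match_to_watchlist(vendors, watchlist)
```

Lowering the threshold would surface more candidate matches at the cost of more false positives, which is the trade-off the maturity slide points at.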
Understanding the FDA approach
[Diagram, built up over six slides. Components shown:]
► Data sources, structured and unstructured: transactional data; master & reference data; business intelligence data; social media data
► Database management tools: big data and/or SQL server data processing platform
► Detection tools: rules-based tests; text mining & search; visualization & risk ranking
► Investigation tools: statistical anomalies & predictive; pattern matching; transaction review, risk ranking, case manager, task delegation, and data refresh / scripting automation
► Case review; management review; findings & recommendations
► Repeat the process: continuous monitoring (on-site, centralized, outsourced)
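The "rules-based tests" and "risk ranking" components can be sketched as follows. This is an illustrative assumption of how such tests might be wired together, not the tooling shown on the slide: the payment fields, the rule names, the weights, and the approval limit are all hypothetical.

```python
from collections import defaultdict

# Hypothetical payment records; field names are assumptions for illustration.
payments = [
    {"id": 1, "vendor": "V100", "invoice": "INV-1", "amount": 5000.00, "weekday": 2},
    {"id": 2, "vendor": "V100", "invoice": "INV-1", "amount": 5000.00, "weekday": 3},
    {"id": 3, "vendor": "V200", "invoice": "INV-7", "amount": 9950.00, "weekday": 6},
    {"id": 4, "vendor": "V300", "invoice": "INV-9", "amount": 1234.56, "weekday": 1},
]

APPROVAL_LIMIT = 10_000.00  # assumed approval threshold

# Rule name -> (weight, test on a single payment). Weights are arbitrary.
RULES = {
    "round_amount": (1, lambda p: p["amount"] % 1000 == 0),
    "just_below_limit": (2, lambda p: 0.95 * APPROVAL_LIMIT <= p["amount"] < APPROVAL_LIMIT),
    "weekend_posting": (1, lambda p: p["weekday"] >= 5),  # Monday = 0
}
DUPLICATE_WEIGHT = 3


def duplicate_ids(records):
    """Ids of payments sharing vendor, invoice number and amount with another payment."""
    seen = defaultdict(list)
    for p in records:
        seen[(p["vendor"], p["invoice"], p["amount"])].append(p["id"])
    return {pid for ids in seen.values() if len(ids) > 1 for pid in ids}


def risk_rank(records):
    """Score each payment by the rules it breaches and sort highest score first."""
    dupes = duplicate_ids(records)
    ranked = []
    for p in records:
        breaches = [name for name, (_, test) in RULES.items() if test(p)]
        score = sum(RULES[name][0] for name in breaches)
        if p["id"] in dupes:
            breaches.append("duplicate_invoice")
            score += DUPLICATE_WEIGHT
        ranked.append({"id": p["id"], "score": score, "breaches": breaches})
    return sorted(ranked, key=lambda r: r["score"], reverse=True)


ranked = risk_rank(payments)
```

Reviewers would then work the list from the top, which is one way to read the "transaction review" step in the diagram.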
What Are Clients Saying?
What Are Clients Saying? FDA Technology
What Are Clients Saying? FDA Challenges
Survey question: With respect to forensic data analytics, what would you say is your single biggest challenge or requirement in your organization?
► Uncertainty about the relevance of FDA in the Company: 2%
► FDA producing positive results to indicate and prove any fraud or bribery that is occurring: 3%
► FDA is not prevalent to the culture: 3%
► Huge volume of data to analyze: 4%
► To identify fraudulent information across large data sets: 5%
► Lack of human resources or manpower to operate FDA: 5%
► Spreading the FDA culture across different Business Units: 6%
► Difficulty in adapting FDA to comply with different regulations in various markets: 6%
► Poor quality or lack of accuracy in the data: 8%
► To prevent fraud rather than discover fraud: 9%
► FDA is too expensive: 10%
► Convincing senior management or the company about the benefits of FDA: 10%
► Improving the quality of the analysis process: 15%
► Challenges with combining data across various IT systems: 15%
► Getting the right tools or expertise for FDA: 26%
Technology
Technology: Text Analytics
► Keyword library
► Concept analysis
► Entity extraction
► Text is everywhere
Document/data review analytics
► Concept induction and linguistic analysis
► Keywords and Ontologies
► Emotive tone and ethical issue detection
► Topic modeling, concept mining, entity extraction
► Social network and actor analysis
► Centrality and proximity
► External domains, family and friends
► Ontologies
Ontologies
► Keywords alone do not reliably identify key concepts
► Ontologies allow us to capture concepts appearing in unstructured textual data
► Ontologies can be developed automatically or manually, and are reusable across engagements
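To make the difference between a keyword and an ontology concrete, here is a minimal sketch in which each concept groups several terms and phrases, and a text is tagged with every concept it triggers. The miniature ontology, its concept names, and the `tag_concepts` helper are hypothetical and far smaller than a real classifier set.

```python
import re

# Hypothetical miniature ontology: concept -> terms and phrases.
ONTOLOGY = {
    "secretive": ["don't mention", "keep this between us", "delete this email", "off the record"],
    "payment_pressure": ["facilitation payment", "special fee", "expedite the approval"],
}


def tag_concepts(text, ontology=ONTOLOGY):
    """Return {concept: [matched terms]} for terms found as whole words or phrases."""
    lowered = text.lower()
    found = {}
    for concept, terms in ontology.items():
        matched = [
            term for term in terms
            # Guard both ends so a term does not match inside a longer word.
            if re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", lowered)
        ]
        if matched:
            found[concept] = matched
    return found


message = "Please don't mention our convo to anyone!"
tags = tag_concepts(message)
```

Because the concept, not the individual term, is what gets reported, the same ontology can be reused on a new dataset and extended with new terms without changing the downstream review.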
Stock ontologies
► Non-Responsive (1,832 classifiers; 19,188 terms)
► NR Business: Resumes, “doughnuts in the kitchen,” newsletters, itineraries
► NR Junk: Spam, fantasy football, baby pictures
► Emotive Tone (66 classifiers; 4,101 terms): Angry, Confused, Secretive, Surprised
► Ethical Issues (419 classifiers; 5,042 terms): Discrimination, workplace safety, price fixing, personal problems
Technology: Risk Scoring
Filter by selected analytics
Review breaches on targeted analytics
Technology: Geocoding and Heat Maps
Hotspots of activity are easily identified
Identify global epicenters of activity, as well as anomalies
Technology: Communication Analytics
WHO: Who is talking to whom? (Social Networking)
• People-to-people analysis
• Entity-to-entity analysis
• Map communication lines to organization chart
WHAT: About what? (Concept Clustering)
• Top words mentioned
• Key concepts / topics
• Top or unusual dollar amounts
• Sensitive words / phrases
WHEN: Over which time period? (Communication Over Time)
• When communications occur
• Communication spikes around key business events
WHY: How do they feel? (Sentiment Analysis)
• Positive vs. Negative Sentiment
• Top 10 negative journal entries
• Top 10 angry emails
• Top 10 most concerned emails
• Customer survey analysis
• Employee survey analysis
Technology: Email Analytics
► Identify the strength of known relationships
► Identify unknown relationships
► Can be performed on email logs alone; the emails themselves are not needed
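A minimal sketch of relationship analysis from logs only, assuming each log row carries just a sender and a recipient. The log rows, the list of known relationships, and both helper functions are hypothetical; message volume is used here as a simple stand-in for relationship strength.

```python
from collections import Counter

# Hypothetical log rows: (sender, recipient). No subject or body is needed.
log = [
    ("ann", "bob"), ("bob", "ann"), ("ann", "bob"),
    ("ann", "cara"), ("dave", "bob"), ("bob", "ann"),
]


def relationship_strength(rows):
    """Count messages per unordered pair of people, ignoring self-addressed mail."""
    pairs = Counter()
    for sender, recipient in rows:
        if sender != recipient:
            pairs[frozenset((sender, recipient))] += 1
    return pairs


def unknown_relationships(strength, known_pairs):
    """Pairs that exchange mail but are absent from the known-relationship list."""
    known = {frozenset(pair) for pair in known_pairs}
    return {pair: count for pair, count in strength.items() if pair not in known}


strength = relationship_strength(log)
unknown = unknown_relationships(strength, [("ann", "bob"), ("ann", "cara")])
```

Here `strength` ranks the known relationships by traffic, and `unknown` surfaces pairs that were not expected to be in contact at all.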
Entity analytics
Unusual connection between entities
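One common kind of unusual connection is a vendor that shares a bank account, phone number or address with an employee. The sketch below assumes that interpretation; the records, field names, and the `canon` and `shared_attributes` helpers are hypothetical.

```python
def canon(value: str) -> str:
    """Keep only letters and digits, lowercased, so formatting differences vanish."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def shared_attributes(vendors, employees, fields=("bank_account", "phone", "address")):
    """Return (vendor_id, employee_id, [shared fields]) for every linked pair."""
    links = []
    for v in vendors:
        for e in employees:
            shared = [
                f for f in fields
                if v.get(f) and e.get(f) and canon(v[f]) == canon(e[f])
            ]
            if shared:
                links.append((v["id"], e["id"], shared))
    return links


# Hypothetical master data, for illustration only.
vendors = [
    {"id": "V1", "bank_account": "12-3456-789", "address": "10 Main St."},
    {"id": "V2", "phone": "(555) 010-2000", "address": "4 Elm Road"},
]
employees = [
    {"id": "E7", "bank_account": "123456789", "address": "22 Oak Ave"},
    {"id": "E9", "phone": "555-010-2000", "address": "9 Pine Lane"},
]
links = shared_attributes(vendors, employees)
```

Exact matching after normalization is deliberately strict; a real engagement would likely add fuzzy comparison for names and addresses.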
Technology: Social Media Analysis
Email Analytics
► Research example: FCPA bribery case
[Chart: keyword hits as a percentage of total emails, by Month/Year from December 2004 to June 2007, in three panels: incentive/pressure terms, opportunity terms, rationalization terms. Annotation: investigation timeframe, September to March.]
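The measure on the chart, keyword hits as a percentage of total emails per month, can be computed roughly as follows. This is a sketch under assumptions: the term lists for the three categories and the sample emails are hypothetical, and a hit is counted once per email per category.

```python
from collections import defaultdict

# Hypothetical term lists for the three categories named on the chart.
TERMS = {
    "incentive_pressure": ["make the number", "deadline", "under pressure"],
    "opportunity": ["override", "no one will notice", "write off"],
    "rationalization": ["everyone does it", "i deserve", "just this once"],
}

# Hypothetical emails as (month, body).
emails = [
    ("2005-09", "We are under pressure to make the number."),
    ("2005-09", "Lunch on Friday?"),
    ("2005-10", "Just this once, no one will notice."),
    ("2005-10", "Everyone does it, and the deadline is close."),
    ("2005-10", "Minutes attached."),
    ("2005-10", "Agenda for Monday."),
]


def monthly_hit_pct(messages, terms=TERMS):
    """Per month, the share of emails containing at least one term of each category."""
    totals = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))
    for month, body in messages:
        totals[month] += 1
        text = body.lower()
        for category, words in terms.items():
            if any(word in text for word in words):
                hits[month][category] += 1
    return {
        month: {category: hits[month][category] / totals[month] for category in terms}
        for month in sorted(totals)
    }


pct = monthly_hit_pct(emails)
```

Plotting each category of `pct` over time gives one panel per category, so a rise in any of the three can be compared against the investigation timeframe.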
Convergent analytics model
Data sources:
► Vendor master
► Employee master
► Travel & Entertainment
► Accounts Payable
► General Ledger
Components:
► Anti-fraud library of journal entry and cash disbursement tests
► Vendor background checks and politically exposed persons
► EY/ACFE keyword library of misappropriation & bribery/corruption terms (multi-language)
► Suspicious transaction profiling and predictive modeling
► Text analytics: “who,” “what,” “when,” “where”
► Accelerated decision support and dynamic reporting
Emotive tone -- Secretive
From: Donna
Sent: Wednesday, November 9, 2011 10:45 AM
To: Nikki
Subject: RE: Shhhhh
Absolutely.
-----Original Message-----
From: Nikki
Sent: Wednesday, November 09, 2011 10:45 AM
To: Donna
Subject: Shhhhh
Please don't mention our convo to anyone! I am high risk so I want to be sure that everything is safe and where it should be.
Thank You
Scott Clary
Ernst & Young LLP
Principal
Fraud Investigation & Dispute Services
Houston, TX
(713) [email protected]