The Future World of Analytics for Audit and Fraud
Dan French – Founder & CEO, Consider Solutions
Audit Technology & Fraud Investigation Conference
Audit Conferences Europe (ACE)
November 3rd & 4th 2015, London
© 2015 Consider Solutions All rights reserved 1
Today’s Session
Dan French
Founder & CEO, Consider Solutions
Mission
‐ Solutions for World Class Finance
Footprint
‐ Financial Control & Compliance
‐ Risk Assurance
‐ Process Optimization
Clients
Context
“The typical organization loses the equivalent of 5% of its revenues to fraud & waste each year”
Source: Global Economic Crime Survey; PwC
Agenda
• Introduction
• Challenge for Audit & Risk Assurance
• The Role of Data Analytics
• Machine Learning – The Next Generation
• Evolution
• The Future of the Audit Team?
• Q&A
Challenge for Assurance
The Standardisation & Control Myth
We invest heavily in ERP implementation to drive:
‐ Process standardisation
‐ Business efficiency
‐ Economies of scale
However, only some of the value gets released . . .
‐ Businesses implement standard systems and achieve
A standard data input process
NOT
A standard business process
ERP enabled standardisation example
ERP is configured to only allow GRN if PO exists, however…
‐ Truck drops off shipment, but no PO exists
‐ Warehouse calls up Purchasing to create a PO
‐ Purchasing creates PO for Shipment
‐ GRN is created against PO
‘First time match’ KPI looks good despite process breakdown!
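The breakdown on this slide leaves a trace in the data: the PO is raised on or after the day the goods arrive. A minimal sketch of that check follows; the record layout, the sample values and the `min_lead_days` parameter are assumptions for illustration, not fields from any particular ERP.

```python
from datetime import date

# Hypothetical records: (po_number, po_created, goods_received).
receipts = [
    ("PO-1001", date(2015, 3, 2), date(2015, 3, 9)),    # PO raised a week ahead
    ("PO-1002", date(2015, 3, 10), date(2015, 3, 10)),  # PO raised on delivery day
    ("PO-1003", date(2015, 3, 12), date(2015, 3, 20)),
]


def retrospective_pos(records, min_lead_days=1):
    """Return PO numbers created fewer than min_lead_days before goods receipt."""
    flagged = []
    for po_number, po_created, goods_received in records:
        lead_days = (goods_received - po_created).days
        if lead_days < min_lead_days:
            flagged.append(po_number)
    return flagged


flagged = retrospective_pos(receipts)
print(flagged)
```

A 'first time match' KPI would count all three POs as clean matches; the lead-time check separates out the one that was created only because the truck had already arrived.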
Data Analytics Identify Exceptions
Opportunity for Assurance
Business Performance & Risk Management
Two sides of the same coin
For example
‐ Risk KRI
Credit check
Payment terms
Delivery quantity & quality
‐ Performance KPI
DSO
Exceptions Matter for both!
Data Analytics Identify Exceptions
Purchase to Pay
‐ Duplicate Payments (fuzzy match)
‐ Retrospective POs
‐ Changing payment terms
‐ Same Bank Account usage
Order to Cash
‐ Price Changes
‐ Undelivered orders
‐ Exceptional customer credits/returns
‐ Payment terms
Fixed Assets
‐ Inappropriate asset depreciation periods
‐ Misclassified capital equipment
Travel Expenses
‐ Duplicate claims
‐ Suspicious claims
‐ Ineligible items claims
‐ Repeating amounts
Financial Close
‐ Postings into prior closed periods
‐ Manual payments
Trading
‐ OFAC limitations
‐ Sunshine Act implications
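As an illustration of the 'Duplicate Payments (fuzzy match)' check listed under Purchase to Pay, the sketch below pairs payments with equal amounts whose vendor names and invoice references are nearly identical. It uses `difflib.SequenceMatcher` from the Python standard library; the sample payments and both similarity thresholds are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical payments: (payment_id, vendor_name, invoice_ref, amount).
payments = [
    ("P1", "Acme Supplies Ltd", "INV-20431", 1250.00),
    ("P2", "ACME Supplies Limited", "INV20431", 1250.00),
    ("P3", "Brightwater Logistics", "BL-7781", 1250.00),
    ("P4", "Acme Supplies Ltd", "INV-20977", 310.40),
]


def normalise(text):
    """Lower-case and keep only letters and digits so formatting noise is ignored."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def similarity(a, b):
    """Similarity ratio between 0 and 1 on the normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def fuzzy_duplicates(rows, vendor_min=0.8, ref_min=0.9):
    """Return pairs of payment ids that look like the same payment made twice."""
    pairs = []
    for left, right in combinations(rows, 2):
        same_amount = abs(left[3] - right[3]) < 0.005
        if (same_amount
                and similarity(left[1], right[1]) >= vendor_min
                and similarity(left[2], right[2]) >= ref_min):
            pairs.append((left[0], right[0]))
    return pairs


duplicates = fuzzy_duplicates(payments)
print(duplicates)
```

Comparing every pair is quadratic in the number of payments, so a real data set would first be grouped (for example by amount) before the string comparison is run.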
What We Have Learned So Far
Conventional approaches are not sufficiently effective:
‐ Programmatic – need to know the rules for known anomalies
‐ Yes / no ‘red flag’ logic
‐ High proportion of ‘false positives’
‐ Periodic data sampling
‐ Inability to ask complex questions of the data
‐ Little or no context to the results
‐ Susceptible to human bias and error
‐ Need for cross-discipline business / technical skills
‐ Average detection time is too long (if detected at all)
‐ High level of effort and investment required to implement & sustain exception analytics
There is a big gap between average and best practice
Best practice is expensive in current paradigm
Research
Guiding principles are to identify techniques that will provide
‐ Precision
Complex questions to significantly reduce false positives
Less reliance on human interpretation
Discover previously unknown anomalies
‐ Timeliness
Fast time to detection after initial occurrence
Speed of analysis
‐ Usability
No specialist / on-going scripting or programming skills for the client
Transparency of results – easy to understand what you have
‐ Efficiency
Radically cheaper approach to democratise analytics
Radically faster processing on cheap cloud computing
Research – New Techniques
Artificial Intelligence
‐ Machine Learning
Instance Based learning
– K-Star
Bayesian Learning
– Naive Bayes
– Bayesian Network
Functions
– Support Vector Machines (SVM)
Time Series Analysis
– Kalman Filter
– Peer Group Analysis (PGA)
Decision Tree
– Random Forest
Deep Learning
– Recurrent Neural Network (RNN)
– Feed Forward Neural Network (FFNN)
Deep Learning
‐ Recurrent Neural Network (RNN)
Used for classification and regression on sequential data
Supervised / Unsupervised
Used for outlier detection
Promising initial results when predicting sequential data for outlier detection; the best outlier detector tested
‐ Feed Forward Neural Network (FFNN)
Used for classification and regression on static data
Supervised / Unsupervised (as one class classifier)
Classification of fraudulent expenses.
Effective at predicting expense fraud when trained on the MP expenses training set
Machine Learning: Unsupervised approach
Unsupervised learning can be used to model ‘normal’ behaviour and discover anomalies. When several of these anomalies occur in the same area, it may be grounds for suspicion.
For example
‐ Supplier with unusually sporadic payments
‐ Payments always processed at end of day
‐ By user who normally deals with one time suppliers
Flag for further investigation
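The idea of flagging only when several weak anomalies coincide can be sketched as follows. 'Normal' is modelled per feature by the median and the median absolute deviation (MAD), and a vendor is flagged when at least two features are far from normal. The three features, the sample values and both thresholds are assumptions for illustration; this is a simple stand-in, not the technique used in the research.

```python
from statistics import median

# Hypothetical per-vendor features, already extracted from payment data:
# (variability of days between payments,
#  share of payments posted at end of day,
#  share of the posting user's activity that is with one-time suppliers)
features = {
    "V001": (0.20, 0.10, 0.05),
    "V002": (0.25, 0.12, 0.04),
    "V003": (0.18, 0.08, 0.06),
    "V004": (0.22, 0.11, 0.05),
    "V005": (0.95, 0.90, 0.70),  # unusual on all three dimensions
    "V006": (0.21, 0.60, 0.05),  # unusual on one dimension only
}


def robust_scores(values):
    """Score each value by its distance from the median, in MAD units."""
    centre = median(values)
    mad = median([abs(v - centre) for v in values]) or 1e-9
    return [abs(v - centre) / mad for v in values]


def co_occurring_anomalies(table, score_min=5.0, min_hits=2):
    """Return vendors that are anomalous on at least min_hits features."""
    vendors = list(table)
    hits = {vendor: 0 for vendor in vendors}
    for column in zip(*table.values()):
        for vendor, score in zip(vendors, robust_scores(column)):
            if score > score_min:
                hits[vendor] += 1
    return [vendor for vendor in vendors if hits[vendor] >= min_hits]


suspects = co_occurring_anomalies(features)
print(suspects)
```

Requiring more than one hit is what keeps the single-feature outlier (V006) out of the result, which is the false-positive reduction the slide describes.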
Machine learning: Supervised approach
Supervised learning can be used to label and classify known exceptions for certain fraud schemes and map these scheme models to new data and infer new exceptions.
Classifier (Scheme A, Scheme B, Scheme C) applied to database of new transactions:
ID       Fraud Scheme
720424   -
720425   -
720426   -
720427   -
720428   C
720429   -
720430   -
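A minimal sketch of the supervised idea: learn one profile per labelled scheme, then assign each new transaction to the nearest scheme, or to no scheme if nothing is close. A nearest-centroid rule is used here purely as a simple stand-in for the classifiers named in the deck. The feature values, the `radius` cut-off and the meaning of the three features are assumptions; only the transaction IDs and the single 'C' label mirror the slide.

```python
from math import dist

# Hypothetical labelled history: scheme -> feature vectors of known cases.
# Assumed features, each scaled 0 to 1: amount relative to vendor norm,
# bank account recently changed, posted outside working hours.
known_cases = {
    "A": [(0.9, 0.0, 0.1), (0.8, 0.1, 0.0)],
    "B": [(0.1, 0.0, 0.9), (0.2, 0.1, 1.0)],
    "C": [(0.5, 1.0, 0.2), (0.6, 0.9, 0.1)],
}


def centroid(vectors):
    """Average the labelled examples of one scheme, feature by feature."""
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


centroids = {scheme: centroid(vectors) for scheme, vectors in known_cases.items()}


def classify(vector, radius=0.3):
    """Return the closest scheme, or '-' when no scheme is within radius."""
    scheme, gap = min(
        ((name, dist(vector, centre)) for name, centre in centroids.items()),
        key=lambda pair: pair[1],
    )
    return scheme if gap <= radius else "-"


new_transactions = {
    720424: (0.10, 0.0, 0.1),
    720425: (0.20, 0.0, 0.2),
    720426: (0.15, 0.0, 0.0),
    720427: (0.30, 0.0, 0.3),
    720428: (0.50, 1.0, 0.1),
    720429: (0.10, 0.0, 0.2),
    720430: (0.20, 0.0, 0.1),
}

labels = {tx_id: classify(vector) for tx_id, vector in new_transactions.items()}
for tx_id, label in labels.items():
    print(tx_id, label)
```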
Deep learning – Comprehension
Raw pixels -> Abstraction
Recurrent Neural Networks (RNN)
Deep learning method which learns sequentially
Can be used to comprehend audio, text, video or predict time series
For example, if you give the complete works of Shakespeare to an RNN – training it to predict the 100th character given the previous 99 - you end up with a Shakespeare generator
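The training set described here is just a sliding window over the text: each example is the previous 99 characters, and the target is the character that follows. The sketch below builds those pairs only; it does not train a network. The function name and the short demo string are assumptions for illustration.

```python
def training_pairs(text, context=99):
    """Yield (previous `context` characters, next character) pairs."""
    for end in range(context, len(text)):
        yield text[end - context:end], text[end]


# A short context keeps the demonstration readable; the slide uses 99.
demo_pairs = list(training_pairs("to be or not to be", context=5))
for window, target in demo_pairs[:3]:
    print(repr(window), "->", repr(target))
```

Once trained on such pairs, the network generates text by predicting one character, appending it to the window, and repeating.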
RNN: Shakespeare
This was generated a character at a time. It shows the network has:
‐ Learned how to put characters together to make (Shakespearian) English
‐ Learned simple grammar
‐ Learned the structure of how plays are written
RNN: Uncharacteristic Invoices
Diagram: RNN, Vendor X, Comparison
The RNN ingests a sequence of invoices for a specific vendor
Develops a model about what the next invoice will look like given:
‐ What it has learned about invoices in general
‐ What it has learned about this vendor specifically
By comparing the RNN’s model to the actual next invoice, we can flag invoices which are uncharacteristic for this vendor.
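The comparison step can be sketched independently of the network. In the example below, `predict_next` is a deliberately simple stand-in for the RNN (medians and the most common bank account from the vendor's history), and `surprise` measures how far the actual invoice is from that expectation. The invoice fields, sample values and the 0.5 cut-off are all assumptions for illustration.

```python
from statistics import median, mode

# Hypothetical invoice history for one vendor: (amount, bank_account, line_items).
history = [
    (1000.0, "GB11-0001", 4),
    (1040.0, "GB11-0001", 5),
    (980.0, "GB11-0001", 4),
    (1010.0, "GB11-0001", 4),
]


def predict_next(invoices):
    """Stand-in for the RNN: expected amount, bank account and line count."""
    return (
        median([inv[0] for inv in invoices]),
        mode([inv[1] for inv in invoices]),
        median([inv[2] for inv in invoices]),
    )


def surprise(predicted, actual):
    """Sum of relative gaps on numeric fields plus 1.0 for a changed account."""
    amount_gap = abs(actual[0] - predicted[0]) / predicted[0]
    account_gap = 0.0 if actual[1] == predicted[1] else 1.0
    items_gap = abs(actual[2] - predicted[2]) / predicted[2]
    return amount_gap + account_gap + items_gap


expected = predict_next(history)
typical_invoice = (1020.0, "GB11-0001", 4)
odd_invoice = (1015.0, "GB99-7777", 4)

for invoice in (typical_invoice, odd_invoice):
    score = surprise(expected, invoice)
    print(invoice, round(score, 3), "FLAG" if score > 0.5 else "ok")
```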
Example #1 – Fraudulent Invoicing
The perpetrator submitted fictitious invoices from a real supplier, but changed the bank account to be their own. These invoices were processed alongside genuine invoices paid to that company. The deception was not detected by conventional methods and only came to light when the perpetrator’s bank notified authorities because of unusually high value transactions passing through the account.
Based on this, our research modelled a scheme to look for a small increase in transactions per month which coincided with a change in bank account details, using a data set of 50,098 invoices.
Example #1 – Fraudulent Invoicing
In isolation, payments to different bank accounts are not a significant indicator:
Example #1 – Fraudulent Invoicing
Varying invoice amounts are also not significant:
Example #1 – Fraudulent Invoicing
The actual anomalous data is unremarkable:
Example #1 – Fraudulent Invoicing
Using time series anomaly detection with the relevant attributes, the false invoices scored very highly compared to all other invoices and were easily detected
7 false invoices in a data set of 50,098, with detection occurring 4 months after the first invoice
Also significant was that no false positives were identified
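The slides do not say which time series method produced this result. As one possible illustration, the sketch below runs a one-dimensional Kalman filter (a technique listed earlier in the deck) over a vendor's monthly invoice counts and flags months where the count is unexpectedly high and a new bank account was used in the same month. The counts, the noise variances and the threshold of 3.0 are assumptions for illustration.

```python
from math import sqrt


def kalman_scores(series, process_var=0.05, measurement_var=1.0):
    """Return the normalised innovation (surprise) for each observation."""
    level, variance = float(series[0]), 1.0
    scores = [0.0]  # the first month has no prediction to compare against
    for observed in series[1:]:
        variance += process_var                   # predict: uncertainty grows
        innovation = observed - level             # gap between actual and expected
        innovation_var = variance + measurement_var
        scores.append(innovation / sqrt(innovation_var))
        gain = variance / innovation_var          # update: blend in the observation
        level += gain * innovation
        variance *= (1.0 - gain)
    return scores


# Hypothetical monthly invoice counts for one vendor.
monthly_invoices = [10, 11, 10, 9, 10, 11, 10, 14, 15, 14]
# Hypothetical flag: did any payment that month go to a new bank account?
account_changed = [False] * 7 + [True, True, False]

scores = kalman_scores(monthly_invoices)
flagged_months = [
    month
    for month, (score, changed) in enumerate(zip(scores, account_changed))
    if score > 3.0 and changed
]
print(flagged_months)
```

Neither signal is used alone: a rise in volume without an account change, or an account change without a rise, is left unflagged, which matches the observation that each attribute is unremarkable in isolation.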
Example #2 – UK MPs Expense Claims
UK MPs Expense Claims were analysed using Machine Learning and Classification technology with respect to:
‐ Expense Date, Category, Type, Cost, Description and individual MP’s expense history compared to average expense cost per category
Trained on MP Expense Claims 2010 – 2013
‐ Positive labels coming from the Legg report
‐ 677,066 claimed expense items
‐ 3,268 repaid expense items
Analysed MP Expense Claims 2013 – present
‐ 77,065 claimed expense items
‐ 206 repaid expense items (Legg Report)
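The counts above imply a very low base rate of repaid items, which is worth working out because it shapes how any classifier on this data should be judged. The arithmetic below uses only the four figures quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide.
train_claims, train_repaid = 677_066, 3_268
analysed_claims, analysed_repaid = 77_065, 206

train_rate = train_repaid / train_claims
analysed_rate = analysed_repaid / analysed_claims

print(f"Training set repaid rate: {train_rate:.2%}")
print(f"Analysed set repaid rate: {analysed_rate:.2%}")

# A model that never predicts a repayment is right this often,
# so plain accuracy is a poor yardstick for this problem.
print(f"Accuracy of 'never repaid': {1 - analysed_rate:.2%}")
```

With fewer than one item in two hundred repaid, precision and recall at a chosen likelihood threshold are more informative than accuracy.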
All Claimed Expenses in Green, Repayments in Red = Needle in a Haystack
Repayments Highlighted
Threshold >15% Repayment Likelihood
Threshold >25% Repayment Likelihood
Threshold > 40% Repayment Likelihood
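Raising the likelihood threshold trades coverage for precision: fewer items are flagged, and a larger share of them are genuine repayments. The sketch below shows that trade-off at the three thresholds used on these slides. The scored items are invented for illustration and are not the MP expenses results.

```python
# Hypothetical scored items: (predicted repayment likelihood, actually repaid).
scored = [
    (0.02, False), (0.05, False), (0.10, False), (0.12, False), (0.18, False),
    (0.22, True), (0.30, False), (0.35, True), (0.45, True), (0.60, True),
]


def precision_recall(rows, threshold):
    """Return (items flagged, precision, recall) for likelihood > threshold."""
    flagged = [repaid for likelihood, repaid in rows if likelihood > threshold]
    true_positives = sum(flagged)
    total_positives = sum(repaid for _, repaid in rows)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / total_positives if total_positives else 0.0
    return len(flagged), precision, recall


for threshold in (0.15, 0.25, 0.40):
    count, precision, recall = precision_recall(scored, threshold)
    print(f">{threshold:.0%}: {count} flagged, "
          f"precision {precision:.2f}, recall {recall:.2f}")
```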
Comparison of Repayments and Repayment Prediction of Selected MP Over Time
Machine Learning Approach
Subject domains organised as “Themes & Schemes”
A multi-layered hierarchical process to create features that are interpreted by a machine learning engine:
‐ Feature creation – discovery of relationships between features and composite relationship inferences
‐ Behaviour profiles – for example how a certain organisation / person completes a document
‐ Smart feature-based rules
‐ Automated feedback for supervised classifiers to act in ensemble with their unsupervised cousins
Low cost, high performance computing
Machine Learning Approach
Source Data -> Data Abstraction -> Anomaly Detection Engine (ADE) -> Results -> Feedback
Feature Creation
• Machine Generated - Pattern Recognition, Behaviour Profiling, Time Series, Peer Group, ...
• Domain Expertise – Conventional indicators
Classification
• Supervised – Deep Learning, Neural Network, Support Vector Machines, ...
• Unsupervised – Feature Based Smart Rules
Intelligent Scoring Algorithm
Current Research - P2P/AP
Based on a Risk Data Matrix, analyse and risk rate the data using an ensemble of the latest artificial intelligence and machine learning techniques in concert with some traditional “red flag” indicators. For example:
‐ Complex multi dimensional analysis across business process data
‐ Changes in behaviour of people entering invoices / payments
‐ Changes in patterns of invoices / payments over time
‐ Dissimilarity of invoices submitted by same vendor
‐ Dissimilarity of payments made to same vendor
‐ Unusual invoiced items and quantities based on previous history
‐ Unusual expense spending patterns
‐ Unusual variances for an expense item
‐ Validation against external data sources
‐ …
Themes & Schemes
Vendors
‐ Duplicate – Exact & Fuzzy
‐ Dormant – 12, 24, 36 months
‐ Sanction List
‐ Vendor activity with no existing vendor master data
Invoices
‐ Duplicate – Exact & Near Match
‐ Top 10 Invoice Activity
Payments
‐ Duplicate
‐ Unusual bank accounts and cross-vendor duplicates
‐ Payments to Vendors after period of inactivity
‐ Invoice-Payment period outliers
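Two of the schemes above, dormant vendors in 12, 24 and 36 month bands and payments after a period of inactivity, reduce to date arithmetic. A minimal sketch follows; the function names, sample dates and the 12 month default for a 'period of inactivity' are assumptions for illustration.

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months elapsed from earlier to later."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return months


def dormancy_band(last_activity, as_of, bands=(36, 24, 12)):
    """Return the largest band (in months) the vendor has been idle for, or None."""
    idle = months_between(last_activity, as_of)
    for band in bands:
        if idle >= band:
            return band
    return None


def payments_after_inactivity(payment_dates, min_idle_months=12):
    """Return payments that follow a gap of at least min_idle_months."""
    ordered = sorted(payment_dates)
    return [
        current
        for previous, current in zip(ordered, ordered[1:])
        if months_between(previous, current) >= min_idle_months
    ]


as_of = date(2015, 11, 3)
print(dormancy_band(date(2014, 10, 1), as_of))
print(dormancy_band(date(2015, 9, 30), as_of))
print(payments_after_inactivity(
    [date(2013, 1, 15), date(2013, 2, 15), date(2015, 3, 1)]
))
```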
Early Research Results
Evolution - Inevitable, Inexorable
Manual by eye sampling
Spreadsheet based analysis
Ad hoc exception assessments
Systematic exception monitoring
Machine learning analytics
Future Role of the Audit Team?
Less Separation between IT & General Audit?
Less Need for Technical Analytics Development
Data Science opportunity
No More Sampling
More focus on business value
‐ Risk -> Diagnosis -> Root Cause Analysis
Future Role of the Audit Team?
Business Performance & Risk Management
Business Assurance
Two sides of the same coin
For example
‐ Risk KRI
Credit check
Payment terms
Delivery quantity & quality
‐ Performance KPI
DSO
Review
• Introduction
• Challenge for Audit & Risk Assurance
• The Role of Data Analytics
• Machine Learning – The Next Generation
• Evolution
• The Future of the Audit Team?
• Q&A
Discussion
Dan French, Founder & CEO - Consider Solutions
Eliminating Error, Waste & Fraud - Data Science advancing World Class Finance
www.consider.biz/thinking/
@consider_ations
#worldclassfinance