Data Warehouse Structures for AML Applications

Jerzy Korczak, Wroclaw University of Economics; LSIIT, CNRS, Strasbourg, France

Błażej Oleszkiewicz, Wroclaw, Poland

Money Laundering - Definition

Money laundering is the practice of engaging in financial transactions in order to conceal the identity, source, and/or destination of money, and is a main operation of the underground economy.

In this paper:

• identification of the methods and technology of the anti-money laundering (AML) process
• introduction of the SART system
• structures of the data warehouse
• selected problems of AML systems

Outline

• Problem of AML – the state of the art
• Fundamental aspects of AML system design
• System for Analysis and Registration of Transactions
• Architecture of data warehouse
• Case study – examples of a few selected problems
• Conclusion and future research

Process of money laundering

Stages:

• Placement: refers to the initial point of entry for funds derived from criminal activities.
• Layering: refers to the creation of complex networks of transactions which attempt to obscure the link between the initial entry point and the end of the laundering cycle.
• Integration: refers to the return of funds to the legitimate economy for later extraction.

Examples of stages of the process

Placement Stage / Layering Stage / Integration Stage

Cash paid into bank (sometimes with staff complicity or mixed with proceeds of legitimate business).

Wire transfers abroad (often using shell companies or funds disguised as proceeds of legitimate business).

Resale of goods/assets. Income from property or legitimate business assets appears "clean".

Monies are placed into the retail economy or are smuggled out of the country.

Complex web of transfers (both domestic and international) makes tracing the original source of funds virtually impossible.

Establishment of anonymous companies.

Transformation into other asset forms: travellers cheques, postal orders, etc.

Cash exported. Cash deposited in overseas banking system.

Sending of false export-import invoices overvaluing goods.

AML software – the state of the art

• packages include capabilities of name analysis, rules-based systems, statistical and profiling engines, neural networks, link analysis, peer group analysis, and time sequence matching
• KYC solutions that offer case-based account documentation acceptance and rectification, as well as automatic risk scoring of the customer (taking account of country, business, entity, product, transaction risks)
• other elements: portals to share knowledge and e-learning for training and awareness

Types of AML systems

All financial institutions globally are required to monitor, investigate and report transactions of a suspicious nature to the financial intelligence unit of the central bank in the respective country.

Types of software addressing AML business requirements:

• Currency Transaction Reporting (CTR) systems, which deal with large cash transaction reporting requirements (15,000 EUR)
• Customer Identity Management systems, which check various negative lists (such as OFAC) and represent an initial and ongoing part of Know Your Customer (KYC) requirements
• Transaction Monitoring Systems, which focus on identification of suspicious patterns of transactions which may result in the filing of Suspicious Activity Reports (SARs). Identification of suspicious (as opposed to normal) transactions is part of the KYC requirements.

Modules of AML system

Software applications effectively monitor bank customer transactions on a daily basis and, using customer historical information and account profile, provide a "whole picture" to the bank management.

Each vendor's software works somewhat differently; some of the modules in AML software are:

• Know Your Customer (KYC)
• Entity Resolution
• Transaction Monitoring
• Compliance Reporting
• Investigation Tools

Transaction Monitoring Systems

TMS focus on identification of suspicious patterns of transactions which may result in the filing of Suspicious Activity Reports (SARs). Identification of suspicious transactions is part of the KYC requirements.

Financial institutions face penalties for failing to properly file CTR and SAR reports, including heavy fines and regulatory restrictions, even to the point of charter revocation.


Typical solutions vs. Analytical SQL Server

[Figure: components of a typical solution (DBMS SQL Server, OLAP Server, Application Server, Analytical Server, data integration, workstations) compared with a single Analytical SQL Server serving the workstations.]

Architecture of Analytical SQL Server

SQL Server

SQL Vector Engine

SQL Extension: Business Intelligence

SQL Extension: Analytic Intelligence

Fact tables (classical and vectorized)

Hierarchical Tables (ROLAP Dimensions)

ROLAP Cubes

ROLAP DataMart Cubes (Vectorized ROLAP Objects)

Linear Algebra

Statistics, Econometrics

Linear Programming

Additional Analytical Extensions

SART internal architecture

Application Layer

SART Application

Analytical Intelligence

Analysis

Business Intelligence

OLAP Data Warehouse

Analytical SQL Server

Major modules of SART

SART:

• Personal Data Register
• Entity Data Register
• Customers and Actual Beneficiaries Register
• Reporting
• Data Import/Export Module
• Transaction Register
• Account Register
• Customer Assessment
• Customer Behaviour Monitoring
• Transactions Register for GIIF
• Bank Transactions
• Analysis Module
• Risk Analysis

Major modules of SART

SART:

• Reporting
• Data Import/Export Module
• Transaction Register
• Account Register
• Customer Behaviour Monitoring
• Bank Transactions
• Analysis Module


Main issues

• Problem of scalability
• Data structure: Chart of Accounts of the General Ledger
• OLAP Data Warehouse based on the General Ledger
• Reporting
• Transaction chains

Architecture of data warehouse

Problem of scalability

Size of dimensions:

• General Ledger dimension: 60 000 entries
• Bank customers dimension: 500 000 entries
• Time dimension: 3600 entries (the duration of operations is 10 years)
• Number of measures in the OLAP cube: 5

Approximate size of the OLAP cube: 230.2 PB

This approximate calculation of the cube's size shows that it is not feasible to store the OLAP data without compression.

Approximate number of entries in the OLAP cube: 60 000 × 500 000 × 3600 × 5 = 0.54 × 10^15.

Considering the minimum size of data stored in the OLAP cube (4-byte dimension identifier, 8-byte measure value), this value should be multiplied by 3 × 4 × 5 × 8 = 480, which gives 259.2 × 10^15 bytes.
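The estimate can be reproduced with a few lines of Python. This is only a sketch that follows the slide's arithmetic as stated, including its factor of 480; the variable names are illustrative, and the conversion assumes that "PB" is meant as a binary petabyte (2^50 bytes), which is the reading that reproduces the 230.2 figure.

```python
# Reproduce the slide's OLAP cube size estimate (variable names are illustrative).
general_ledger_entries = 60_000
customer_entries = 500_000
time_entries = 3_600
measures = 5

# Number of entries in the uncompressed cube.
cube_entries = general_ledger_entries * customer_entries * time_entries * measures

# Size factor exactly as given on the slide: 3 x 4 x 5 x 8 = 480.
size_factor = 3 * 4 * 5 * 8

total_bytes = cube_entries * size_factor

# Assumption: "PB" is read as a binary petabyte (2**50 bytes).
total_pb = total_bytes / 2**50

print(f"entries: {cube_entries:.2e}")
print(f"bytes:   {total_bytes:.4e}")
print(f"size:    {total_pb:.1f} PB")
```

With a decimal petabyte (10^15 bytes) the same byte count would read as 259.2 PB, so the two figures on the slide agree once the unit convention is fixed.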

Heterogeneous data warehouse dimensions of General Ledger

Homogeneous dimensions of TIME

Data Warehouse of General Ledger

Modeling and implementation of the Data Warehouse of General Ledger:

• Fact Table
• Dimensions

Data Warehouse of General Ledger

Star Schema

fact_table
  PK   id
  FK1  id_time_dim
  FK2  id_chart_of_accounts_dim
  FK3  id_entity_dim
       debits_sum
       debits_count
       credits_sum
       credits_count
       all_count

time_hierarchy_dim1
  PK   id
       year
       quarter
       month
       day

chart_of_accounts_dimension
  PK   id
       account_name
       account_type
       account_code

entities_dimensions
  PK   id
       entity_name
       entity_type
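As an illustration of how the star schema could be queried, the sketch below creates the four tables in an in-memory SQLite database and rolls the fact measures up to month level. The table and column names come from the slide; the column types, the sample rows and the choice of SQLite are assumptions made for the example, not the SART implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table and column names follow the slide; the column types are assumed.
conn.executescript("""
CREATE TABLE time_hierarchy_dim1 (
    id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER,
    month INTEGER, day INTEGER);
CREATE TABLE chart_of_accounts_dimension (
    id INTEGER PRIMARY KEY, account_name TEXT,
    account_type TEXT, account_code TEXT);
CREATE TABLE entities_dimensions (
    id INTEGER PRIMARY KEY, entity_name TEXT, entity_type TEXT);
CREATE TABLE fact_table (
    id INTEGER PRIMARY KEY,
    id_time_dim INTEGER REFERENCES time_hierarchy_dim1(id),
    id_chart_of_accounts_dim INTEGER REFERENCES chart_of_accounts_dimension(id),
    id_entity_dim INTEGER REFERENCES entities_dimensions(id),
    debits_sum REAL, debits_count INTEGER,
    credits_sum REAL, credits_count INTEGER, all_count INTEGER);
""")

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO time_hierarchy_dim1 VALUES (?, ?, ?, ?, ?)",
    [(1, 2008, 1, 1, 2), (2, 2008, 1, 1, 3), (3, 2008, 1, 2, 1)])
conn.execute(
    "INSERT INTO chart_of_accounts_dimension VALUES (1, 'Cash', 'asset', '100')")
conn.execute(
    "INSERT INTO entities_dimensions VALUES (1, 'Customer A', 'person')")
conn.executemany(
    "INSERT INTO fact_table VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [(1, 1, 1, 1, 100.0, 1, 0.0, 0, 1),
     (2, 2, 1, 1, 50.0, 2, 25.0, 1, 3),
     (3, 3, 1, 1, 0.0, 0, 75.0, 1, 1)])

# Roll the measures up from day level to month level with a single join.
monthly = conn.execute("""
    SELECT t.year, t.month,
           SUM(f.debits_sum), SUM(f.credits_sum), SUM(f.all_count)
    FROM fact_table AS f
    JOIN time_hierarchy_dim1 AS t ON t.id = f.id_time_dim
    GROUP BY t.year, t.month
    ORDER BY t.year, t.month
""").fetchall()
print(monthly)
```

Because each dimension is a single denormalized table, every roll-up needs only one join per dimension, which is the usual argument for the star layout.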

Data Warehouse of General Ledger

Normalized Time Dimension in a Snowflake Schema

fact_table, chart_of_accounts_dimension and entities_dimensions are unchanged from the star schema. The time dimension is split into four tables:

time_dim_year
  PK   id
       year

time_dim_quarter
  PK   id
  FK1  id_year
       quarter

time_dim_month
  PK   id
  FK1  id_quarter
       month

time_dim_day
  PK   id
  FK1  id_month
       day
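The cost of normalizing the time dimension shows up in queries: resolving a day to its month, quarter and year takes a chain of joins. The sketch below builds the four time tables in an in-memory SQLite database and resolves each day to a full date. Table and column names follow the slide; the column types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Names follow the slide; types and sample rows are assumed for illustration.
conn.executescript("""
CREATE TABLE time_dim_year (
    id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE time_dim_quarter (
    id INTEGER PRIMARY KEY,
    id_year INTEGER REFERENCES time_dim_year(id), quarter INTEGER);
CREATE TABLE time_dim_month (
    id INTEGER PRIMARY KEY,
    id_quarter INTEGER REFERENCES time_dim_quarter(id), month INTEGER);
CREATE TABLE time_dim_day (
    id INTEGER PRIMARY KEY,
    id_month INTEGER REFERENCES time_dim_month(id), day INTEGER);

INSERT INTO time_dim_year VALUES (1, 2008);
INSERT INTO time_dim_quarter VALUES (1, 1, 1);
INSERT INTO time_dim_month VALUES (1, 1, 1), (2, 1, 2);
INSERT INTO time_dim_day VALUES (1, 1, 15), (2, 2, 29);
""")

# Walk the snowflake from day up to year: three joins instead of none.
rows = conn.execute("""
    SELECT d.id, y.year, q.quarter, m.month, d.day
    FROM time_dim_day AS d
    JOIN time_dim_month AS m ON m.id = d.id_month
    JOIN time_dim_quarter AS q ON q.id = m.id_quarter
    JOIN time_dim_year AS y ON y.id = q.id_year
    ORDER BY d.id
""").fetchall()
print(rows)
```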

Data Warehouse of General Ledger

Hierarchical schema (still in the style of a star schema)

fact_table, chart_of_accounts_dimension and entities_dimensions are unchanged from the star schema. The time dimension becomes a single hierarchical table:

time_hierarchy_dim
  PK   id
  FK1  id_node
       node_class
       node_value
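A hierarchical dimension table of this shape can hold levels of different depth in one table. The sketch below shows one way to aggregate a measure over such a hierarchy; it reads id_node as a reference to the parent node, which is an assumption (the slide does not say), and the node rows and measure values are made up.

```python
from collections import defaultdict

# Rows of (id, id_node, node_class, node_value).
# Assumption: id_node points to the parent node; None marks the root.
nodes = [
    (1, None, "year", 2008),
    (2, 1, "quarter", 1),
    (3, 2, "month", 1),
    (4, 3, "day", 2),
    (5, 3, "day", 3),
    (6, 1, "month", 12),  # non-uniform: a month attached directly to the year
]

# Index the children of every node once.
children = defaultdict(list)
for node_id, parent_id, _node_class, _node_value in nodes:
    children[parent_id].append(node_id)

# Hypothetical measure values keyed by the time node a fact row references.
debits_by_node = {4: 100.0, 5: 50.0, 6: 25.0}


def rollup(node_id):
    """Sum the measure over a node and all of its descendants."""
    total = debits_by_node.get(node_id, 0.0)
    for child_id in children[node_id]:
        total += rollup(child_id)
    return total


print(rollup(3))  # month level
print(rollup(1))  # year level
```

The same recursive walk works whatever the depth of a branch, which is what makes this layout suitable for non-uniform dimensions.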

Data Warehouse – Fact Table

Fact Table – operation

Fact Table and Chart of Accounts

Integration of accounting model and transaction model

Summary of data structure

Technological characteristics:

• non-uniform hierarchy
• number of nodes: 61 297
• of which synthetic accounts: 29 268
• max depth: 10

Application characteristics:

• decrees dictionary of General Ledger
• dictionary of transaction accounts
• dimensions of data warehouse


Case Study – some statistics

Sample database, number of processed records (daily):

• min: ~1,000 rec. (weekend)
• max: ~300,000 rec. (end of month)

Monthly (January 2008):

• total: 2,497,280 rec./month
• daily average: 80,557 rec.
• DW dimensions: 197,046 rec.

Characteristics of DW (General Ledger)

• 13,299,773 rows in the Fact Table
• 20,581,733 Cartesian products in the OLAP cube
• 970,987,198 OLAP operations executed during recomputation of the OLAP cube
• 970,987,198 / 20,581,733 ≈ 47.177 OLAP operations on average per registered decree

OLAP Data Warehouse of General Ledger

Implementation of the OLAP DW of General Ledger:

• Fact Table
• Dimensions
• OLAP Cube
• Cube Pivot as a change of view in the OLAP cube

OLAP Data Warehouse – General Ledger

[Screenshots: General Ledger views of the OLAP data warehouse.]

OLAP Data Warehouse – OLAP report with a view on the Chart of Accounts dimension

[Screenshots: OLAP reports viewed along the Chart of Accounts dimension.]

Data Warehouse – Cube Pivot

[Figure: OLAP pivot operation on a cube with the dimensions General Ledger, Client and Time, shown before and after the operation; labels on the cube: Cash in ZLP, Account 1 to Account N.]

Performance of DW SART OLAP

Data import: 57,952 decrees

OLAP recalculation:

• OLAP calculation: 4 min. 10 sec. (250 sec.)
• OLAP operations: 5,275,770
• number of created OLAP Cartesian products: 991,675
• average number of OLAP operations per second: 21,103.1 op./sec.
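The throughput figure follows directly from the operation count and the calculation time. A minimal check in Python, using only the numbers reported on the slide (the variable names are illustrative):

```python
# Figures reported on the slide.
olap_operations = 5_275_770
calculation_seconds = 4 * 60 + 10  # 4 min. 10 sec.

# Average throughput during the OLAP recalculation.
ops_per_second = olap_operations / calculation_seconds

print(f"{calculation_seconds} sec., {ops_per_second:.1f} op./sec.")
```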

OLAP reporting performance:

• Chart of Accounts dimension OLAP view: maximum ~2 sec., average ~0.6 sec.
• Time dimension OLAP view: maximum ~1.2 sec., average ~0.4 sec.

Summary of the DW architecture of the SART OLAP

• Data Warehouse model based on non-uniform dimensions
• OLAP model based on non-uniform dimensions
• Cube Pivot operation with slice functionality

SART – Transaction merging

Transaction merging process in SART:

• building a transaction model based on the General Ledger decrees
• integration of the transaction model with the General Ledger accounting model
• integration of the transaction model with OLAP reporting

SART - Cash Flow Chains Analysis

Cash Flow Chains Analysis (CFCA):

• Cash Flow Chains
• OLAP Data Warehouse of Cash Flow Chains
• Cash Flow Chains Analysis – example of use

[Figure: cash flow chain graph with accounts K1 to K17, arranged as source accounts, intermediate accounts and end accounts, connected by transactions labelled in the range T1 to T20; the same graph is repeated over several animation steps.]

Transaction chains – tree view

[Figure: transaction chains displayed as trees.]

Transaction chains (cash flow)

[Figure: a cash flow chain built up over several animation steps. Numeric labels shown: 22 921, 13 359, 2 738, 26 830, 5, 4, 44, 6, 9, 59, 26 010, 16 704, 14 037, 18 214, 18 215.]

SART - Cash Flow Chains Analysis

Four major CFCA rates:

• Source Accounts/Transaction chains ratio
• Destination Accounts/Transaction chains ratio
• Inner Accounts/Transaction chains ratio
• Number of account cycle chains
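The slides do not define these rates precisely, so the sketch below is one possible reading rather than the SART definition. It assumes that a source account has no incoming transactions, a destination account has no outgoing ones, an inner account has both, and each ratio divides the number of such accounts by the number of chains running from a source to a destination. The transactions are hypothetical and the graph is assumed to be acyclic, so the fourth rate (cycle chains) is not covered.

```python
from collections import defaultdict
from functools import lru_cache

# Hypothetical transactions: (transaction_id, from_account, to_account).
transactions = [
    ("T1", "K1", "K5"),
    ("T2", "K2", "K5"),
    ("T3", "K5", "K6"),
    ("T4", "K6", "K7"),
    ("T5", "K6", "K8"),
    ("T6", "K3", "K6"),
]

successors = defaultdict(list)
has_incoming = set()
for _tid, src, dst in transactions:
    successors[src].append(dst)
    has_incoming.add(dst)
successors = dict(successors)  # plain dict: no implicit key creation below

# Classify accounts by the direction of their transactions (assumed definitions).
accounts = set(successors) | has_incoming
sources = accounts - has_incoming
destinations = accounts - set(successors)
inner = accounts - sources - destinations


@lru_cache(maxsize=None)
def count_chains(account):
    """Count chains from `account` to any destination (acyclic graph assumed)."""
    if account in destinations:
        return 1
    return sum(count_chains(nxt) for nxt in successors[account])


chain_count = sum(count_chains(src) for src in sources)

rates = {
    "source": len(sources) / chain_count,
    "destination": len(destinations) / chain_count,
    "inner": len(inner) / chain_count,
}
print(chain_count, rates)
```

Counting chains with memoization avoids materializing every chain, which matters when the number of chains is far larger than the number of transactions.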

CASE STUDY

SART - Cash Flow Chains Analysis

Sample database of OLAP Data Warehouse CFCA:

• Number of transactions: 46,459
• Number of accounts in CFCA: 38,844
• Number of chains: 5,021,459
• Number of chain links: 29,567,581

CASE STUDY

Analysis of Transaction Chains

The case study analyzes the transaction chains from the source account (id=22921) to the target account (id=14037).

CASE STUDY

Transaction Chain Analysis

Report characteristics:

• # of generated chains: 674
• # of transactions participating in chains: 88
• # of source accounts of sub-chains: 5
• # of target accounts of sub-chains: 5

Significant risk of money laundering (ML) using "shell companies".
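Report figures such as the number of generated chains and the number of participating transactions can be obtained by enumerating every chain between the chosen source and target accounts. The sketch below does this with a depth-first search over a small hypothetical transaction list; the account names, transaction ids and the rule that a chain may not visit an account twice are assumptions, not the SART algorithm.

```python
from collections import defaultdict

# Hypothetical transactions: (transaction_id, from_account, to_account).
transactions = [
    (1, "A", "B"),
    (2, "A", "C"),
    (3, "B", "D"),
    (4, "C", "D"),
    (5, "B", "C"),
    (6, "D", "A"),  # closes a cycle; never followed under the simple-path rule
]

outgoing = defaultdict(list)
for tid, src, dst in transactions:
    outgoing[src].append((tid, dst))


def find_chains(source, target):
    """Return every chain (list of transaction ids) from source to target
    that visits no account twice, found by depth-first search."""
    chains = []

    def walk(account, visited, path):
        if account == target:
            chains.append(path)
            return
        for tid, nxt in outgoing.get(account, []):
            if nxt not in visited:
                walk(nxt, visited | {nxt}, path + [tid])

    walk(source, {source}, [])
    return chains


chains = find_chains("A", "D")
participating = {tid for chain in chains for tid in chain}

print("generated chains:", len(chains))
print("transactions participating in chains:", len(participating))
```

The number of chains can grow much faster than the number of transactions, which is consistent with the report above listing 674 chains built from only 88 transactions.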

Cash Flow Chains Analysis

[Screenshot: selected source account.]

[Screenshots: selected source and target accounts (query result).]

Conclusions and future work

• SART has been implemented
• SART does not need any supplementary components
• high system performance (close to real time)
• extensions: credit analysis, operational risk

Further research:

• standardisation of the data warehouse model
• development of BI and CI
• mapping of the Object/Relation model in SART
• study of data mining algorithms