This document is CONFIDENTIAL and its circulation and use are RESTRICTED. © 2012 KPMG LLP, a UK limited liability partnership, is a subsidiary of KPMG Europe LLP and a member firm of the KPMG network of independent member firms affiliated with KPMG International, a Swiss cooperative. All rights reserved.
Paul Preuveneers – Principal Technologist
Lee Pollington – Principal Consultant
The Only Operational Database Technology for Mission-Critical Big Data Applications
Agenda
• Big Data and MarkLogic
• What is MarkLogic?
• MarkLogic in Financial Services
• MarkLogic Integration Points (Connectors / Toolkits)
Volume
• Petabyte / Exabyte
• Billions of items
• Social Media
• Machine data
• Data processes producing data
Velocity
• 10Ks of transactions per second
• In & out
• Streams
• Bulk processing
Complexity
• Patterns
• Inference
• Unstructured
• Disparate events
• Relationships
Variety
• Varied sources
• Varied data types
Variability
• Changing data types
Value
• Value from decision support
• Value from operational efficiencies
Agenda
• Big Data and MarkLogic
• What is MarkLogic?
• MarkLogic in Financial Services
• MarkLogic Integration Points (Connectors / Toolkits)
What is MarkLogic Server?
• Special Purpose DBMS for poly-structured information, with enterprise expectations
  • ACID transactions
  • Backup, Full/Partial Replication, Distributed Transactions
• Search Engine Kernel, with enterprise expectations
  • Full text
  • Faceted navigation, at massive scale
  • Boolean, proximity, stemming, tokenization, decompounding, case, diacritics, language…
• Application Server
  • HTTP (including RESTful)
  • XCC Java/.NET
  • WebDAV
What makes MarkLogic DBMS Special?
• Not Relational (RDBMS)
• XML
  • The Only Data Model Required
  • Schema Agnostic
  • Text a First-class Citizen among Data Types
  • XQuery/XSLT
• Optimized Search Engine Algorithms
• Very Low DBA Overhead (0.5 FTE / 100 hosts)
• 5-Minute Install
• 5-Minute Scale-Out
• Database and Search Engine are the same
What makes MarkLogic Search Special?
• Transactional: Enterprise Scale (no index latency)
• Unicode (Internationalization)
• Multiple Query Types
  • Analytics: Aggregation, Facets & Ranges, Co-occurrence, Geospatial
  • Text Search: Boolean, Stemming, Word Lexicons, Dictionary & Thesauri
  • Alerting: Profiles, Alerts, Filters, Tipping, Selectors, “Triggers” …
  • Powerful Search Combination (e.g. Text + Analytics + Alerting)
• Processing Near the Data (fast search, low bandwidth)
• Database and Search Engine are the same
Search: Universal Index
[Figure: each Term (words such as “accelerating”, “creation”, “content”, “application”, “agility”; elements such as <article> and <article> / <title>; values such as product: MarkLogic) points to a Term List of Document References (e.g. 126, 130, 167, …). Range Indexes and Geospatial indexes sit alongside the term lists.]
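The term-to-term-list idea in the Universal Index slide can be sketched as a plain inverted index. This is a minimal illustration in Python, not MarkLogic's implementation: the sample documents, their IDs, and the function names `build_index`, `intersect` and `search_all` are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to a sorted list of the document IDs that contain it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in set(docs[doc_id].lower().split()):
            index[term].append(doc_id)
    return index


def intersect(a, b):
    """Merge-intersect two sorted term lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search_all(index, terms):
    """Return the IDs of documents that contain every term."""
    lists = sorted((index.get(t.lower(), []) for t in terms), key=len)
    if not lists:
        return []
    result = lists[0]
    for term_list in lists[1:]:
        result = intersect(result, term_list)
    return result


# Hypothetical documents; the IDs are illustrative only.
docs = {
    126: "accelerating content creation",
    130: "content application agility accelerating creation",
    142: "content creation",
    167: "accelerating application content creation",
}
index = build_index(docs)
print(search_all(index, ["accelerating", "content", "creation"]))
```

Starting from the shortest term list keeps the intermediate result small, which is the usual reason for sorting the lists by length before intersecting them.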
MarkLogic Can Scale
• Scale Up: Typically 1 TB+ XML per Server
• Scale Out: Low Hundreds(++) of Servers in a Cluster
• Commodity Hardware
  • 2-CPU x 6-core/hyperthreaded
  • 32+ GB RAM
  • 3x disk: local mount with failover
• OS
  • Linux RHEL 5
  • Solaris 10
  • Windows 2003/8 (XP/Vista/7 for Dev)
Shared-Nothing Cluster
[Figure: E Hosts 1 to 3 form the AppServer tier; D Hosts 4 to k form the Data tier and hold partition1 to partitionm. Both tiers run the same code-base, with HA & DR.]
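The split between an AppServer tier and a Data tier of independent partitions suggests a scatter-gather query pattern. The sketch below is a generic illustration in Python under that assumption, not a description of how MarkLogic routes queries: the classes `DataHost` and `EvaluatorHost`, the term-count scoring, and the sample documents are all hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


class DataHost:
    """Holds one partition and answers queries from its own documents only."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # doc_id -> text

    def search(self, term, k):
        # Score = number of occurrences of the term (a toy relevance measure).
        hits = []
        for doc_id, text in self.documents.items():
            score = text.lower().split().count(term)
            if score > 0:
                hits.append((score, doc_id))
        return heapq.nlargest(k, hits)


class EvaluatorHost:
    """Fans a query out to every data host and merges the partial results."""

    def __init__(self, data_hosts):
        self.data_hosts = data_hosts

    def search(self, term, k=3):
        term = term.lower()
        with ThreadPoolExecutor(max_workers=len(self.data_hosts)) as pool:
            partials = list(pool.map(lambda h: h.search(term, k), self.data_hosts))
        merged = heapq.nlargest(k, (hit for part in partials for hit in part))
        return [doc_id for _score, doc_id in merged]


# Hypothetical partitions spread over two data hosts.
host4 = DataHost("D Host 4", {1: "trade trade swap", 2: "bond"})
host5 = DataHost("D Host 5", {3: "trade", 4: "trade trade trade"})
evaluator = EvaluatorHost([host4, host5])
print(evaluator.search("trade", k=2))
```

Because each data host returns only its local top k, the evaluator merges at most k results per host, which keeps network traffic low as hosts are added.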
Agenda
• Big Data and MarkLogic
• What is MarkLogic?
• MarkLogic in Financial Services
• MarkLogic Integration Points (Connectors / Toolkits)
Financial Services Solutions
Highly Transactional
• Operational Data Store / Trade Store
Content Aggregation & Discovery
• ISDA Contract Analysis (Electronic & Paper)
• Document Analysis (e.g. Sales Process, Financial Directives)
• Situational Awareness
• Customer On-Boarding
Content Publishing
• Research / Policy Authoring & Distribution
Operational Data Store / Trade Store
What is it?
- High Volume Trades (Derivatives, Equities, FX etc.) in silos
- Mostly represented in XML (e.g. FpML, FIXML)
- Point-in-time queries (e.g. exposure by counterparty)
- Risk Management (understand exposure, auditing)
Why are we good at it?
- High Performance with Native XML compared to RDBMS
- We are a transactional DB (ACID + business continuity)
- Less hardware required / commodity servers
- No shredding of XML (lowers risk of corruption)
- Can aggregate over multiple schemas
- Easily accommodate new schemas, changes in schema
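Aggregating exposure by counterparty over multiple schemas, without shredding the XML into tables, can be sketched with Python's standard `xml.etree.ElementTree`. The two trade formats below are invented for the example and are not real FpML or FIXML. The `EXTRACTORS` table and the function name are likewise assumptions, and currencies are ignored for brevity.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical trade documents in two different (made-up) schemas.
TRADES = [
    "<swap><counterparty>ACME</counterparty><notional>1000000</notional></swap>",
    "<fxTrade><party role='counterparty'>ACME</party><amount>250000</amount></fxTrade>",
    "<swap><counterparty>Globex</counterparty><notional>500000</notional></swap>",
]

# Per-schema paths: root tag -> (path to counterparty, path to amount).
# A new schema is accommodated by adding one entry here.
EXTRACTORS = {
    "swap": ("counterparty", "notional"),
    "fxTrade": ("party[@role='counterparty']", "amount"),
}


def exposure_by_counterparty(xml_documents):
    """Sum trade amounts per counterparty across all known schemas."""
    totals = defaultdict(float)
    for xml_text in xml_documents:
        root = ET.fromstring(xml_text)
        counterparty_path, amount_path = EXTRACTORS[root.tag]
        counterparty = root.findtext(counterparty_path)
        totals[counterparty] += float(root.findtext(amount_path))
    return dict(totals)


print(exposure_by_counterparty(TRADES))
```

Each document stays intact in its original form; only the query knows where each schema keeps the counterparty and the amount.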
Operational Data Store / Trade Store
Example: JP Morgan Chase ODS
- Live for 12+ months
- 2.25 million OTC Derivatives (450+ million documents)
- Strategic platform mandated for core transaction processing
- Short-listed for Best Investment Banking Initiative at The Banking Technology Awards 2011
- Agile onboarding of new Derivatives products
- Huge reduction in time to process FO XML messages
- 20 Sybase systems replaced with 3-Node MarkLogic cluster
It's a Trade Processing Story
Started with Derivatives
- Natural fit with documents
- Complex instruments, “low volume” instruments
It’s a trade workflow engine
- Enterprise Service Bus / Component architecture
- New products
- Modifications to existing products
Securities had a new challenge for us
ISDA Contract Analysis
What is it?
- Swaps / Derivatives Contracts
- Risk Management (understand exposure)
- Effect of Change (e.g. credit rating, termination events)
Why are we good at it?
- Contracts are a combination of data and text
- Front-end solutions like Exari use Word for contract authoring but output structured XML
- Good query functions for filtering and aggregation of exposure as well as other what-if scenarios
Where do we need help?
- If in paper form, OCR and enrichment are required. This is hard, time-consuming and costly (up to $150 per doc for managed service)
- Most contracts are in paper form (90+ percent)
Document Analysis (e.g. Sales Process, Financial Directives)
What is it?
- Making sense of poly-structured data (avoid BIG fines)
- Extracting patterns and trends (e.g. did we say the right thing to our customer at the right time? / PPI mis-selling)
- Developing value calculations in hard-to-handle formats (i.e. aggregating and unlocking the calculations in Excel)
Why are we good at it?
- Good conversion tools for PDF, MS Office etc.
- Great full-text search to analyse converted documents
- Inclusion of external content where applicable (RSS, Social Media, Web Sites)
- Group individual Excel spreadsheets for powerful analysis
Where do we need help?
- Enrichment often requires substantial domain expertise
Situational Awareness
What is it?
- Trading Decision Support
- Amalgamation of internal/external poly-structured data
- Heavy geospatial element
- Analysis across datasets (vessels, pipes, weather, RSS)
Why are we good at it?
- Quick take-up of new sets of data
- ML is good at geospatial queries
- ML is good at incorporating external data (web, RSS etc.)
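A geospatial query of the kind this use case relies on, such as "which vessels are within a given distance of a point", can be sketched with the haversine great-circle formula. This is a plain Python illustration, not MarkLogic's geospatial API. The vessel names, coordinates and function names are hypothetical, and a real system would use a geospatial index rather than the linear scan shown here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(items, lat, lon, radius_km):
    """Return names of items whose position lies within radius_km of (lat, lon)."""
    return [
        name
        for name, (item_lat, item_lon) in items.items()
        if haversine_km(lat, lon, item_lat, item_lon) <= radius_km
    ]


# Hypothetical vessel positions (latitude, longitude).
vessels = {
    "Vessel A": (0.0, 0.5),
    "Vessel B": (0.0, 3.0),
    "Vessel C": (0.2, -0.2),
}
print(within_radius(vessels, 0.0, 0.0, 100.0))
```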
Customer On-Boarding
What is it?
- Content Aggregation from multiple CMS
- KYC / Holistic view of customer (good communication)
- Avoid duplication of effort (faster on-boarding)
- Rapid search and retrieval
Why are we good at it?
- Feature-rich, fast search at volume
- 30 Digits allows us to extract from multiple CMS
- Flexible metadata-handling (dynamic facets)
- Able to apply security model from underlying CMS
Where do we need help?
- Lots of content is image-based / requires OCR and data enrichment
Research / Policy Authoring & Distribution
What is it?
- Template-driven authoring
- Ensuring consistency, validation and component re-use
- Dynamic Publishing (VISA, Morgan Stanley, Citigroup)
Why are we good at it?
- Easy template creation and maintenance
- Great integration with MS Office
- Componentisation and versioning easy in ML
- Dynamic assembly based on role/geography etc.
Thank You – Questions?