Roman Gruhn, Director, Information Strategy (EMEA)
A Modern Enterprise Architecture
Digital Platforms Have Changed
The platforms your end users and customers use to engage with your applications and services have fundamentally changed at an unprecedented speed over the past 5 years.
Business: UPFRONT → SUBSCRIBE
Applications: YEARS / MONTHS → WEEKS / DAYS
Customers: PC → MOBILE / BYOD
Engagement: ADS → SOCIAL
Infrastructure: SERVERS → CLOUD
Goals of Digital Transformation
1. Unlocking operational intelligence
2. Enhancing business agility
3. Improving customer-centricity
Source https://451research.com/report-short?entityId=90066 http://www.slideshare.net/JakeHird/101-digital-transformation-statistics-2016
[Chart, axis 0% to 90%. Bars: Boosting bottom line in 5 years; Competing in new segment in 3 years; Disadvantaged by lack of transformation; Actively digitizing business]
Challenges of Digital Transformation
Existing Systems Overwhelmed
Growth in Siloed Data
Lack Real-Time Insight
Data Warehouse Challenges
“Of Gartner's "3Vs" of big data (volume, velocity, variety), the variety of data sources is seen by our clients as both the greatest challenge and the greatest opportunity.”*
Data Variety: Diverse, streaming or new data types
Data Volume: Greater than 100TB
Other Data: Less than 100TB
* From Big Data Executive Summary of 50+ execs from F100, gov orgs; 2014
The New Enterprise Stack
            TRADITIONAL               MODERNIZED
APPS        On-Premise, Monoliths     SaaS, Microservices
DATABASE    Relational (Oracle)       Non-Relational (MongoDB)
EDW         Teradata, Oracle, etc.    Hadoop
COMPUTE     Scale-Up Server           Containers / Commodity Server / Cloud
STORAGE     SAN                       Local Storage & Data Lakes
NETWORK     Routers and Switches      Software-Defined Networks
Data as a Cross-Enterprise Asset
1. Re-use data to power multiple apps
2. Enrich, analyze & monetize the data
3. Enforce privacy and governance
Data Pipeline: Ingest & Store → Query & Transform → Aggregate & Share → Analyze
3 Patterns to Turn Data into a Cross-Enterprise Asset
Single View
Data-as-a-Service
Operationalized Data Lake
Single View
• Efficiently retrieve status of any business entity in real time
• Foundation for analytics, e.g. cross-sell, upsell, churn risk
• REQUIREMENTS:
– Flexible schema + data governance
– Rich query, aggregation, search & reporting
– Highly scalable & continuously available
Solution: Aggregate with a Dynamic Schema
Sources: … Mobile App, Web, Call Centre, CRM, Social Feed
COMMON FIELDS: CustomerID | ActivityID | Type …
DYNAMIC FIELDS: Can vary from record to record
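The split between common and dynamic fields can be sketched in plain Python. This is only an illustration under stated assumptions: the snake_case field names mirror the slide's CustomerID, ActivityID and Type labels, while the `source` and `details` keys, both function names and the sample records are hypothetical. An in-memory dict stands in for the database.

```python
from typing import Any

# Common fields every record must carry (mirrors CustomerID | ActivityID | Type).
COMMON_FIELDS = ("customer_id", "activity_id", "type")


def to_single_view_record(source: str, raw: dict[str, Any]) -> dict[str, Any]:
    """Keep the common fields at the top level; park everything else under 'details'."""
    missing = [f for f in COMMON_FIELDS if f not in raw]
    if missing:
        raise ValueError(f"{source} record is missing common fields: {missing}")
    record = {f: raw[f] for f in COMMON_FIELDS}
    record["source"] = source
    # Dynamic fields: whatever else the source system sent, stored unchanged.
    record["details"] = {k: v for k, v in raw.items() if k not in COMMON_FIELDS}
    return record


def build_single_view(records: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group normalized records by customer so one lookup returns all activity."""
    view: dict[str, list[dict[str, Any]]] = {}
    for record in records:
        view.setdefault(record["customer_id"], []).append(record)
    return view


mobile = to_single_view_record(
    "mobile_app",
    {"customer_id": "C1", "activity_id": "A1", "type": "login", "device": "ios"},
)
crm = to_single_view_record(
    "crm",
    {"customer_id": "C1", "activity_id": "A2", "type": "complaint", "ticket": 42, "priority": "high"},
)
single_view = build_single_view([mobile, crm])
```

Records from different sources share the same top-level shape, so they can be queried together, while each keeps its own source-specific fields.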
Single View: High Level Data Flow
[Diagram labels]
Sources: Web App, CRM App, Mainframe System
Load: Batch or real-time, Documents/Objects, Validation, Update Queue
Operations: … Group, Filter, Sort, Count, Average, Deviations
Consumers (Real-Time Access): Customer Service App, Churn Analytics, Risk Model
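The operations named in the data flow (filter, group, count, average, deviation, sort) can be illustrated with the Python standard library. This is a rough sketch, not the deck's implementation: the record shape, the `amount` field and the function name are assumptions, and a real deployment would run the equivalent aggregation inside the database.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_by_customer(activities, activity_type):
    """Filter by type, group by customer, then count, average and measure spread."""
    # Filter: keep only the activity type of interest.
    selected = [a for a in activities if a["type"] == activity_type]
    # Group: collect amounts per customer.
    groups = defaultdict(list)
    for activity in selected:
        groups[activity["customer_id"]].append(activity["amount"])
    # Count, average, deviation (population standard deviation).
    summary = [
        {
            "customer_id": customer_id,
            "count": len(amounts),
            "average": mean(amounts),
            "deviation": pstdev(amounts),
        }
        for customer_id, amounts in groups.items()
    ]
    # Sort: most active customers first.
    return sorted(summary, key=lambda row: row["count"], reverse=True)


activities = [
    {"customer_id": "C1", "type": "claim", "amount": 100.0},
    {"customer_id": "C1", "type": "claim", "amount": 300.0},
    {"customer_id": "C2", "type": "claim", "amount": 50.0},
    {"customer_id": "C2", "type": "login", "amount": 0.0},
]
report = summarize_by_customer(activities, "claim")
```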
Single View of Customer
Insurance leader generates coveted single view of customers in 90 days – “The Wall”
Problem
• No single view of customer, leading to poor customer experience and churn
• 145 years of policy data, 70+ systems, 24 800 numbers, 15+ front-end apps that are not integrated
• Spent 2 years, $25M trying to build single view with Oracle – failed
Solution (Why MongoDB)
• Built “The Wall,” pulling in disparate data and serving single view to customer service reps in real time
• Flexible data model to aggregate disparate data into single data store
• Expressive query language and secondary indexes to serve any field in real time
Results
• Prototyped in 2 weeks
• Deployed to production in 90 days
• Decreased churn and improved ability to upsell/cross-sell
Data-as-a-Service: Drivers
1. Development agility
2. Data re-use
3. Operational efficiency
4. Corporate governance
5. Cost accountability
DaaS Architecture
[Diagram: App1, App2, App3 → API Access Layer → Operational Data (Customers, Products, Accounts, Transactions) → Infrastructure]
• Shared, multi-tenant database accessible via a common API
• Exposes CRUD, search, geospatial, graph, analytics
• Each data domain isolated into its own collection
• Access privileges and views defined for each collection
• Self-service provisioning, scaling on-demand
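A minimal sketch of the access-layer idea: several apps share one store through a common API, each data domain has its own collection, and privileges are granted per app and collection. Everything here is hypothetical (the `DataService` class, its method names, the app and domain names); in-memory lists stand in for database collections.

```python
class DataService:
    """In-memory stand-in for a shared, multi-tenant data service behind one API."""

    def __init__(self):
        self._collections = {}  # data domain -> list of documents
        self._grants = {}       # (app, domain) -> set of allowed operations

    def grant(self, app, domain, *operations):
        """Define which operations an app may perform on one data domain."""
        self._collections.setdefault(domain, [])
        self._grants.setdefault((app, domain), set()).update(operations)

    def _check(self, app, domain, operation):
        if operation not in self._grants.get((app, domain), set()):
            raise PermissionError(f"{app} may not {operation} {domain}")

    def create(self, app, domain, document):
        self._check(app, domain, "create")
        self._collections[domain].append(dict(document))

    def read(self, app, domain, **criteria):
        """Return copies of documents whose fields equal every given criterion."""
        self._check(app, domain, "read")
        return [
            dict(doc)
            for doc in self._collections[domain]
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


service = DataService()
service.grant("app1", "customers", "create", "read")
service.grant("app2", "customers", "read")  # read-only consumer

service.create("app1", "customers", {"id": 1, "country": "DE"})
service.create("app1", "customers", {"id": 2, "country": "UK"})
german_customers = service.read("app2", "customers", country="DE")

# app2 has no "create" privilege, so the write is rejected.
denied = False
try:
    service.create("app2", "customers", {"id": 3, "country": "FR"})
except PermissionError:
    denied = True
```

Keeping the privilege check inside the service means every app goes through the same governance rules, whichever team owns it.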
Square Enix: DaaS
• Multi-tenant OnLine Suite
• DaaS to studios & developers, exposed as an API
• On-Prem Private Cloud: Manages data shared by all titles
– Player profiles
– Credits
– Leaderboards
– Competitions
– Catalog
– Cross-platform messaging
[Diagram: API Access Layer → MongoDB Shared Data Service → On-Prem Infrastructure (Private Cloud)]
• In-App functionality provisioned to private clusters on AWS
– Game state
– Player metrics
– Game-specific content & features
• Elastically scalable
Data Lake
• Centralized repository for analytics against data collected from operational systems
• Extension of EDW: often based on Hadoop
• 50% of organizations invested in data lakes*
* Gartner
http://www.infoworld.com/article/2980316/big-data/why-your-big-data-strategy-is-a-bust.html
“Thru 2018, 70 percent of Hadoop deployments will not meet cost savings and revenue generation objectives due to skills and integration challenges.” Nick Heudecker, Research Director, Data Management & Integration
Design Pattern: Operationalized Data Lake
[Diagram labels]
Data sources: Sensors, User Data, Clickstreams, Logs
Ingest: Message Queue
Applications: Customer Data Mgmt, Mobile App, IoT App, Live Dashboards
Data and processing: Raw Data, Processed Events, Distributed Processing Frameworks
Analytics: Churn Analysis, Enriched Customer Profiles, Risk Modeling, Predictive Analytics
Real-Time Access: Millisecond latency. Expressive querying & flexible indexing against subsets of data. Updates in place. In-database aggregations & transformations
Batch Processing, Batch Views: Multi-minute latency with scans across TB/PB of data. No indexes. Data stored in 128MB blocks. Write-once-read-many & append-only storage model
• Configure where to land incoming data
• Raw data processed to generate analytics models
• MongoDB exposes analytics models to operational apps and handles real-time updates
• Compute new models against MongoDB & HDFS
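The operationalized data lake flow (land raw data, derive models in batch, expose them to apps, keep them updated in real time) can be sketched with in-memory stand-ins. This is an illustration only: the append-only list plays the role of HDFS, the dict plays the role of MongoDB, and the event shape, the segment rule and all function names are assumptions.

```python
from collections import Counter

data_lake = []          # append-only stand-in for the lake: events are never modified
operational_store = {}  # keyed documents, updated in place, read by operational apps


def land(event):
    """Land a raw incoming event in the lake."""
    data_lake.append(event)


def run_batch_job():
    """Scan the whole lake, derive a simple per-customer model, publish it for apps."""
    clicks = Counter(e["customer_id"] for e in data_lake if e["kind"] == "click")
    for customer_id, click_count in clicks.items():
        profile = operational_store.setdefault(customer_id, {})
        profile["click_count"] = click_count
        # Hypothetical rule: two or more clicks marks a customer as active.
        profile["segment"] = "active" if click_count >= 2 else "casual"


def record_realtime_update(customer_id, field, value):
    """Apply an in-place update without waiting for the next batch run."""
    operational_store.setdefault(customer_id, {})[field] = value


for event in (
    {"customer_id": "C1", "kind": "click"},
    {"customer_id": "C1", "kind": "click"},
    {"customer_id": "C2", "kind": "click"},
):
    land(event)

run_batch_job()
record_realtime_update("C1", "last_page", "/pricing")
```

The batch job rereads the full lake each time, which is slow but complete; the operational store answers single-customer lookups immediately and absorbs updates between batch runs.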
Operational Database Requirements
1. “Smart” integration with the data lake
2. Powerful real-time analytics
3. Flexible, governed data model
4. Scale with the data lake
5. Sophisticated management & security
UK’s Leading Price Comparison Site
Out-pacing Internet search giants with continuous delivery pipeline powered by microservices & Docker running MongoDB, Kafka and Hadoop in the cloud
Problem
• Existing EDW with nightly batch loads
• No real-time analytics to personalize user experience
• Application changes broke ETL pipeline
• Unable to scale as services expanded
Solution (Why MongoDB)
• Microservices architecture running on AWS
• All application events written to Kafka queue, routed to MongoDB and Hadoop
• Events that personalize real-time experience (e.g. triggering email send, additional questions, offers) written to MongoDB
• All event data aggregated with other data sources and analyzed in Hadoop, updated customer profiles written back to MongoDB
Results
• 2x faster delivery of new services after migrating to new architecture
• Enabled continuous delivery: pushing new features every day
• Personalized user experience, plus higher uptime and scalability
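The routing rule in this case study (every event goes to the analytics store, personalization events also go to the operational store) can be sketched as follows. This is a simplified illustration: plain lists stand in for the Kafka consumers' MongoDB and Hadoop sinks, and the event type names are hypothetical labels for the examples listed above.

```python
# Hypothetical event types that drive the real-time user experience.
PERSONALIZATION_EVENTS = {"email_trigger", "additional_question", "offer"}


def route(event, operational_sink, analytics_sink):
    """Send every event to analytics; send personalization events to operations too."""
    analytics_sink.append(event)
    if event["type"] in PERSONALIZATION_EVENTS:
        operational_sink.append(event)


mongodb_sink = []  # stand-in for the operational database
hadoop_sink = []   # stand-in for the analytics cluster

events = [
    {"type": "page_view", "user": "u1"},
    {"type": "offer", "user": "u1"},
    {"type": "email_trigger", "user": "u2"},
]
for event in events:
    route(event, mongodb_sink, hadoop_sink)
```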
Patterns for Modern Data Architectures
Existing Systems Overwhelmed
Growth in Siloed Data
Lack Real-Time Insight
Single View
Data-as-a-Service
Operationalized Data Lake