Data Concierge –
The Foundation of a Digital Business
Thomas Dornis
San Francisco, CA – December 7th 2017
#CWIN17
CWIN San Francisco 2017 – Data Concierge| December 7th 2017
Copyright © 2017 Capgemini. All rights reserved. 2
Table of Contents
Introduction / Insight-Driven Organization
Data Concierge Approach
Balancing Business Value with Industrialized Capabilities
Becoming an Insight Driven Organization – Intelligent Enterprise
A fact-driven, highly industrialized quick scan of your insights & data portfolio, giving you all you need to
make informed, value-driven decisions about the next steps of your insight-driven journey.
Challenges & Opportunities
• Client sees many opportunities to leverage insights & data, for example around Big Data, modernizing the BI landscape, creating advanced analytics and exploring Cognitive & AI
• Client has trouble making decisions, being hesitant about the financial impact, the technology choices, the to-be design and the most feasible next steps to take
What We Do
A high-speed, metrics-driven 6-12 week intervention that delivers a business case, to-be design and roadmap for key portfolio areas:
• BI Modernization
• Business Insights Service Center/ Data Concierge
• Big Data/ Data Science
• Cognitive & AI
• Sector and Domain Analytics
The Value
• A tried & tested approach, refined over multiple years, that delivers solid results in a short timeframe and facilitates decision-making
• Dedicated and specialized Center of Excellence in Bangalore
• Collaborative process, involving all key stakeholders from day 1
• Tool-supported, compelling visual report-outs
Key transformation challenges for large scale data & analytics platforms
End user profiles/ requirements vary widely across the organization
Demand for new data assets becomes unmanageable in a classic governance setup
User story backlogs are getting longer and longer, even when employing Agile methods
Deploying new services is taking too long
New tools & analytical techniques proliferate
DATA CONCIERGE
Industrialize and automatize data provisioning processes as much as possible
Provide a simple, business-oriented information catalog of all data assets available
Provide a simple and managed way for business users to go “self service” where possible
Use intelligent processes for proactive optimization & recommendations
Compressing the time to value, standardizing the cost to insight
Data Concierge Framework
• Business Information Catalog Services: repository, search and recommendation services for business metadata
• Data Operations Services: ongoing management and support of the data assets, including optimization, quality and governance
• Ingestion Services: loading data into the appropriate perimeter with the corresponding SLA and on-demand / self-service features for the business
• Distillation Services: structuring and providing the business with the information they need in the right view
• Data Science and Analytics Services: a bespoke service for data science & analytics with multiple insights delivery models
• Use Case Catalog Services: repository of solutions/ use cases that have been implemented, with their business value and impact to bottlers
“Art of the Possible”
Industrialized, Automatized, Intelligent
Data Concierge: Business Information Catalog Services
Business description of datasets, structures & services
• Through a web-based portal, users can search for data assets within the lake and use the recommendations provided by the tool; shopping-cart approach for data assets
• Communicate data asset characteristics:
  • Ownership: IT & Business champions in charge of the data asset, with contact info
  • Perimeter & SLAs: industrial/certified, experiment, self-service (shareable)
  • Access: which user populations have access to this data asset, and the type of access
  • Current status of accessibility/usability within the lake
• The data lake governance bodies periodically review and curate the additions & modifications made to the catalog:
  • Curation of data assets
  • Governance of self-service & experiment perimeters: discard data assets when initiatives are finalized
• Major communication tool to support user adoption
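The catalog characteristics listed above (ownership, perimeter, access, status) could be modeled roughly as follows. This is a minimal sketch, not the actual Data Concierge implementation: the `CatalogEntry` fields, the `Perimeter` enum, the `search` helper and all sample values are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Perimeter(Enum):
    # Perimeter names come from the slide; the enum itself is an assumption.
    CERTIFIED = "industrial/certified"
    EXPERIMENT = "experiment"
    SELF_SERVICE = "self-service"


@dataclass
class CatalogEntry:
    # Hypothetical record describing one data asset in business terms.
    name: str
    description: str
    business_owner: str
    it_owner: str
    perimeter: Perimeter
    allowed_groups: set = field(default_factory=set)
    status: str = "available"


def search(entries, term, user_group):
    """Return entries matching the term that the user group may access."""
    term = term.lower()
    return [
        e for e in entries
        if user_group in e.allowed_groups
        and (term in e.name.lower() or term in e.description.lower())
    ]


# Sample catalog content (invented for the example).
catalog = [
    CatalogEntry("sales_orders", "Daily sales orders by region",
                 "J. Doe", "IT Ops", Perimeter.CERTIFIED, {"finance", "sales"}),
    CatalogEntry("web_clicks", "Raw clickstream for sales campaign experiments",
                 "A. Roe", "IT Ops", Perimeter.EXPERIMENT, {"marketing"}),
]

# "Shopping cart": the assets a finance user picks from the search results.
cart = search(catalog, "sales", "finance")
print([e.name for e in cart])  # ['sales_orders']
```

The access filter is applied inside the search so that users only see assets they can actually request, which is one possible reading of the "Access" characteristic above.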
Data Concierge: Data Operations (Governance) Services
Intelligence behind the Data Concierge, masking complexity from the user
• Logs data asset search requests by user population
• Builds recommendations from the data assets searched and used by similar populations of users
• Manages access rules between stakeholders (sharing of information, use, etc.)
• Logs access to datasets, structures & services, and to distillation & transformation processes
• Detects anomalous search & access behaviors
• Applies quality & governance rules for data assets, structures & services
• Publishes data assets and structures to the perimeter; deploys services
• Enables & publishes data lineage for each service
• Proposes schema/structure optimization within a Data Hub/ Business Data Lake structure
• Monitors cluster performance and recommends optimizations
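The "recommendations from similar populations" idea above can be illustrated with a simple frequency count over the access log. This is a sketch under stated assumptions: the log format (user, population, asset), the function name `build_recommendations` and the sample data are all hypothetical, and a real service would likely use a richer similarity measure.

```python
from collections import Counter, defaultdict


def build_recommendations(access_log, top_n=3):
    """Suggest assets popular in a user's population that the user has not used yet.

    access_log: iterable of (user, population, asset) tuples (assumed format).
    """
    by_population = defaultdict(Counter)  # population -> asset usage counts
    used_by_user = defaultdict(set)       # user -> assets already used
    population_of = {}                    # user -> population

    for user, population, asset in access_log:
        by_population[population][asset] += 1
        used_by_user[user].add(asset)
        population_of[user] = population

    recommendations = {}
    for user, population in population_of.items():
        ranked = [
            asset
            for asset, _ in by_population[population].most_common()
            if asset not in used_by_user[user]
        ]
        recommendations[user] = ranked[:top_n]
    return recommendations


# Invented sample log for the example.
log = [
    ("ann", "finance", "sales_orders"),
    ("bob", "finance", "sales_orders"),
    ("bob", "finance", "gl_balances"),
    ("cat", "finance", "gl_balances"),
    ("cat", "finance", "fx_rates"),
    ("dan", "marketing", "web_clicks"),
]

recs = build_recommendations(log)
print(recs["ann"])  # ['gl_balances', 'fx_rates']
```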
Data Concierge: Ingestion Services
“Industrialized” Ingestion Services
Support multiple modes of ingestion:
• Real-time streaming
• Micro-batch
• Batch
• Replication
Govern choice of software vendors: Data Integration Platform, API Services, Open Source tools
Service Ticket driven approach
Users request a system and data asset that they didn’t find in the catalog
They receive a defined response time for loading
Stage Key Data Sources
80/20 Rule: 80% of analytics requirements are driven by 20% of data
Stage common/ most used data sources to “seed” the data lake
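The 80/20 rule above can be turned into a simple selection step: rank sources by how often they are requested and stage the top ones until the chosen share of demand is covered. The function name `sources_to_seed`, the source names and the request counts below are hypothetical and only illustrate the arithmetic.

```python
def sources_to_seed(usage_counts, coverage=0.8):
    """Return the most-requested sources that together cover `coverage` of all requests."""
    total = sum(usage_counts.values())
    if total == 0:
        return []
    selected, covered = [], 0
    # Walk through sources from most to least requested.
    for source, count in sorted(usage_counts.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= coverage * total:
            break
        selected.append(source)
        covered += count
    return selected


# Invented request counts per source system.
usage = {"erp_sales": 500, "crm": 300, "web_logs": 90, "hr": 60, "iot": 50}
print(sources_to_seed(usage))  # ['erp_sales', 'crm']
```

In this invented example, two of five sources account for 800 of 1,000 requests, so they would be staged first to "seed" the lake; the remaining sources would be loaded on request through the ticket-driven approach.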
[Figure: Ingestion Services. Sources (Events, Web API, File, App API, Adaptor, RDBMS) are loaded into the Data Lake via Streaming, Batch, SOA/EAI and CDC, with Technical Metadata and Audit.]
Data Concierge: Distillation Services
Conversion of RAW data into usable data
• Users request data assets to be converted into SQL data stores with the right view for their needs
• Users can also request sandboxes for investigation and analytics
• Users can also request Excel, interactive reports and dashboards
• Users receive a defined response time for deployment
• Incorporate MDM and X-Ref data to create single views of given domains (re-usable components)
• Aggregate massive data sets down to manageable result volumes
• Includes the provisioning and de-provisioning of distillations on an automatic or scheduled basis
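As an illustration of combining X-Ref data with aggregation, the sketch below maps source-system customer IDs to a master ID and rolls raw order rows up to one total per master customer. The `XREF` table, the row layout and the `distill` function are hypothetical; the slides do not describe the actual data model.

```python
from collections import defaultdict

# Hypothetical cross-reference: (source system, local id) -> master customer id.
XREF = {
    ("crm", "C-17"): "CUST-001",
    ("erp", "900017"): "CUST-001",
    ("erp", "900042"): "CUST-002",
}


def distill(raw_rows, xref):
    """Aggregate raw order rows into one revenue total per master customer.

    Rows whose customer cannot be resolved are returned separately so they
    can be reviewed rather than silently dropped.
    """
    totals = defaultdict(float)
    unmatched = []
    for row in raw_rows:
        master_id = xref.get((row["system"], row["customer_id"]))
        if master_id is None:
            unmatched.append(row)
            continue
        totals[master_id] += row["amount"]
    return dict(totals), unmatched


# Invented raw rows from two source systems.
raw = [
    {"system": "crm", "customer_id": "C-17", "amount": 100.0},
    {"system": "erp", "customer_id": "900017", "amount": 50.0},
    {"system": "erp", "customer_id": "900042", "amount": 25.0},
    {"system": "erp", "customer_id": "999", "amount": 5.0},
]

single_view, unmatched = distill(raw, XREF)
print(single_view)  # {'CUST-001': 150.0, 'CUST-002': 25.0}
```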
[Figure: Distillation Services. Data from the Data Lake passes through the Distillation Layer (Master Data & X-ref, Transformation, Aggregation) and is provisioned to the Usage Layer (Extraction, SQL, Sandbox, Excel).]
Data Concierge: Data Science & Analytics Services
Question-based Data Science
“Can we improve the forecast for next year’s sales? (12+ month view not in SAP/APO)”
“What are the primary drivers for missed deliveries?”
“Can we predict factory downtime?”
Multiple delivery modes available depending on complexity and business users’ autonomy
Users can request to be fully autonomous, work in an integrated team, or request a fully delivered service
For an integrated team or fully delivered service, a CoE Data Science team can work collaboratively to define the problem space, the data requirements and the required outcome
A fixed price is then provided for the ‘proof of value’
Once the model has been proven it can then be industrialized via the Ingestion and Distillation Services
The different delivery modes can be used as a framework to progressively ramp up the end users on the new system
Enabling new data usages and new tools:
• Initial use cases are delivered in integrated mode; formal delivery by CoE or similar structure
• Over time, stakeholders may deliver their own use cases
Use of a data lab approach and exploration to help users get familiar with the available data assets and the functions of the new system vs. legacy
Data Concierge – Use Case Catalog
[Figure: Use Case Catalog cycle around the Insights Catalog. Labels: Existing Use Cases + New Use Cases, Filter and Eliminate, PoC, Industrialize, Scale & Communicate, Improve.]
Data Concierge – Use Case Catalog
Advanced Analytics/ Data Science Example (CPG)
Connected Assets & Service Recommendation
• Analyze customer usage pattern
• Recommend optimal services offer
• Real-time asset control
Product Design Analytics (Engineering Director)
• System Reliability Modeling
• Intelligent target setting & allocation
• Cost & Weight Analytics
• Approximate models using CAE/Test Data
• Product Benchmarking
Supplier Risk Analytics (Engineering Director)
• Supplier quality analysis & Risk Driver Identification
• Quality & Risk Scoring
• Rationalization & optimal selection
Factory Analytics (Head-Manufacturing)
• Machine Performance & Control
• Energy Consumption Analysis
• Predictive Machine Maintenance
• Stochastic demand & supply planning
Asset Performance & Control (Head-Operations)
• Performance Analysis
• Segmentation
• Adaptive Control Limits
• Real-time monitoring
Predictive Maintenance (Head-Operations, CMO)
• Correlation of events/usage to failures
• Root cause analysis & driver identification
• Failure Prediction & Recommendation
• Explore failure anomalies
Advanced Planning & Scheduling (Head-Operations)
• Analyze disruptions & operational impact
• Stochastic demand & supply planning of resources (e.g. assets, manpower, services, etc.)
Service/ Issue Analytics (CMO & CFO)
• Financial budgeting & reserving
• Claims & supply side optimization
• Recall & product improvement modeling
• Coverage & pricing strategy formulation
Service Optimization (CMO)
• Dealer/service provider performance analysis
• Predict the profitable customers who may sign/renew services contracts
• Service price optimization
Development Approach:
• Data Science “Applications” to address specific use cases
• Contributions by entire system:
  • Business Units
  • Corporate
  • CoE
  • Vendors/ Integrators
• Shared Service manages the industrialization/ scaling
• Apply when ready – driven by stakeholder maturity
Balancing Business Value & Industrialized Capabilities
[Figure: Business Service Value plotted against Capability Value, showing Quick Wins, the Ideal Architecture, the End State and a corridor of balance.]
[Figure: service operating model. Supporting elements: Scorecarding, Catalog of Services, Training, Organisation and Roles. Demand Management: route request, enter into demand backlog, Brief, Business Question Defined, Service Selection, Brief Type. Supply Management: Delivered By, Delivery Plan, Project Plan, Data & Technology Allocation, Resource Allocation, Prioritisation, Governance, Delivery, Delivery Processes. Engagement types: Walk up or Ad-Hoc, Template, Core, Advanced, Custom, Interpretation, Internal delivery, Projects. Delivered by: Customer Business Partner, Engagement Support, Person/screen, Capability Centre or Cluster. Funding: Subscription, Pay-As-You-Go, Pay for Flex Resources.]
Thank You!
Phone: +1 (520) 661-7333
Thomas Dornis
NA Leader – Information Strategy
Insights & Data
Appendix
Data Concierge: Services Mapped to the Data Hub/ Business Data Lake Architecture
[Figure: Sources feed the Data Lake (data domains) through Ingestion Services. Distillation Services (MDM, Transformation, Aggregation) populate the Distillation Layer, which serves the Usage Layer (ODS, Applications, Analytics & Data Science). The layers are split into an Industrial, certified Perimeter (Corporate view, Local view), an Experiment Perimeter (Sandbox space 1 .. N) and a Self service Perimeter (Sandbox space 1 .. N). The Business Information Catalog, Operations and Governance span all layers. Services mapped onto the architecture: Ingestion Services, Distillation Services, Data Science & Analytics Services, Business Information Catalog Services, Data Operations Services, Use Case Catalog Services.]