Big Data Analytics @ Munich Re
SAS Global Forum Executive Program - Orlando
Wolfgang Hauner, Chief Data Officer, Munich Re
Marc Wewers, IT Architect, Munich Re
Center for International Earth Science Information Network - CIESIN - Columbia University. 2016. Gridded Population of the World,
Version 4 (GPWv4): Population Count. Palisades, NY: NASA Socioeconomic Data and Applications Center (SEDAC).
http://dx.doi.org/10.7927/H4X63JVC.
Agenda
1 Data Analytics Framework
2 Technology
3 Method Example: AI
4 Case Study: Cross Selling
Big Data in Trend Radar
Loc-based services
Smart Home
Telematics
Virtual Assistant Systems
Haptic Technologies
Integrated Systems
Autonomous Systems and Devices
Automated Decision Taking
Cloud/Client Architecture
New Payment Models
Big Data
Internet of Things
Cybersecurity
Digitalization
Computing Everywhere
Robotics/Drones
Wearable Devices
Risk-based Security
Context-aware Computing
Open Data
Collaborative Consumption
Predictive Analytics
Industrialization 4.0
Web 4.0
Web-Scale IT
Software-defined Anything
Crowdsourcing
Mobile Health Services
3D Printing
Augmented and virtual worlds
Citizen Development
User Centered Design
Digital Identity
On-Demand-Everything
Big Data
Digitization
Internet of Things
When does it become BIG Data
40,000,000,000,000,000,000,000
Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte, Exabyte, Zettabyte
Yes or No
4 KB Commodore VC 20
3.5 inch floppy disk
Data contained in a library floor
4 TB in-memory Big Data platform (MR)
Petabyte storage big data platform
Google, Facebook, Microsoft, ...
All words ever spoken by humans
43 zettabytes of data will probably be generated by 2020, 300 times the volume in 2005
Source: IBM
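The unit ladder and the growth figure above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal (SI) prefixes with a factor of 1000 per step; the helper names `to_bytes` and `humanize` are illustrative and not from the slides.

```python
# Unit ladder from the slide, smallest to largest.
UNITS = ["Byte", "Kilobyte", "Megabyte", "Gigabyte",
         "Terabyte", "Petabyte", "Exabyte", "Zettabyte"]


def to_bytes(value, unit):
    """Convert a value in the given unit to bytes (assumes 1000 per step)."""
    return value * 1000 ** UNITS.index(unit)


def humanize(n_bytes):
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(n_bytes)
    idx = 0
    while value >= 1000 and idx < len(UNITS) - 1:
        value /= 1000
        idx += 1
    return round(value, 1), UNITS[idx]


# "43 zettabytes by 2020, 300 times the volume in 2005"
volume_2020 = to_bytes(43, "Zettabyte")
volume_2005 = volume_2020 // 300  # implied 2005 volume in bytes

print(humanize(volume_2005))
```

Under these assumptions the implied 2005 volume comes out at roughly 143 exabytes, which gives a sense of how steep a 300-fold increase over 15 years is.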
Big Data Analytics is a Combination of Methods, Technology, Data and People
Methods: Regression Models, Machine Learning Models, Text Mining
Technology: Hardware (compute power), Software (SAS, R, Spark, ...)
Data: Internal Data, External Data, Structured Data, Unstructured Data
People: Data Scientists, Data Engineers, Business People, IT Architects
Building the Team, and the Environment
Programming
Story-telling
Statistics
Visualization
System Implementation
DB Administration
Maths
Modelling
Data Storage
Business/Domain knowledge
Building the infrastructure
Business Units, IT
BI Lab, Production
Data Lake (HDFS): long-term unstructured and structured data
SAS, HANA, Hadoop Stack (each with its own User Interface), in both BI Lab and Production
A2P
Industry icon, tool box and image: used under license from shutterstock.com
Other icons: Munich Re
Wolfgang Hauner / Big Data Analytics & Artificial Intelligence
Source: Munich Re
Which topics drive our clients?
Up-/Cross-Selling
Data Sources
Textmining
Churn Analysis
Supply Chain
Social Media Analysis
Fraud Detection
Big Data Technology
Predictive UW
Telematics
Sensor Data/IoT
Geospatial
Big Data use cases in insurance
Make the uninsurable insurable: Diabetics, Wind Energy
Consolidate the information and process: Automated underwriting, Risk management platform
Artificial Intelligence supported workflow: Early Loss Detection, Visual Loss Adjustment
Images: dpa Picture Alliance; Getty Images; shutterstock.com (used under license)
Agenda
1 Data Analytics Framework
2 Technology
3 Method Example: AI
4 Case Study: Cross Selling
Design principles for Big Data & Analytics Platform
SAS & Hortonworks
Self-Service
Multi Tenancy
One Central Datalake
DevOps
On-Prem & Cloud
Continuous improvement
Automation
Hybrid
Roadmap to Production via Lab environment
Timeline: Q2 2015, Q3 2015, Q4 2015
Phases: Design, Setup / Build, Run
Setup of new BI-Lab Hadoop Cluster
On-boarding & support of Big Data & Analytics pilots
Stabilization of BI-Lab Hadoop cluster
Authentication & Security
Automation
New BI-Lab Hadoop Cluster available
Large shared cluster
Dedicated clusters
Single-Node cluster
Pilot SAS Hadoop Integration
Setup of first BI-Lab Hadoop Cluster
Enhance / Optimize
Building the Big Data & Analytics Platform: Production Environment
Phases: Design, Setup / Build, Run
Timeline axis: months 12, 01 to 12 (2016), 2017
Start setup platform, Release v1.0, Release v2.0, Release v3.0, Release v4.0
SAS: SAS 9.4 M3, SAS Visual Analytics (VA), Self-Service Data Upload, SAS Embedded Process for Hadoop, SAS Enterprise Guide (EG), SAS MS Office Add-in, Data Access to SAP HANA, Oracle & MS SQL-Server, SAS Enterprise Miner, SAS Contextual Analysis, SAS Mobile BI iOS App
Hadoop: HDP 2.3, Hue, Hive, Ambari, Ranger with LDAP, Sqoop, Pig, Spark 1.4, Oozie
Hadoop: HDP 2.4.2, Ambari Views, Spark 1.6, Solr Cloud, Tesseract
SAS VA Row Level Security
SAS HA
HDP 2.5, Atlas, Zeppelin
Data Catalogue Tool
Data Lineage
Compliance & Security
2-week iterations with Rolling Upgrades
Enhance / Optimize
Big Data & Analytics Production Environments: IT and Business Deployment
Environments: Sandbox, Integration, Production
Self-Service
Ad-hoc Analytics
Scheduled Analytics
Business Deployment
IT Deployment
Big Data & Analytics Production Environments: Scalability
Environments: Sandbox, Integration, Production
[Diagram: SAS components (LASR, EP) and HWX components (Hive, YARN) repeated across the Sandbox, Integration and Production environments]
Simplified Server Architecture SAS and HDP
Data nodes: Data Node 1, Data Node 2, Data Node 3, ..., Data Node x, Data Node x+1, Data Node x+2, ..., Data Node y (x < y)
On every data node: SAS EP, HDFS, Hive, YARN, Solr, Spark
SAS In-Memory LASR: four instances shown
Separate servers (SERVER / OS): SAS Mgmt & Metadata; Hadoop Mgmt & Metadata; Hadoop Frontend (Ambari Views, Zeppelin, ...)
EP = Embedded Process: bring calculation to data
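The "bring calculation to data" idea noted for the embedded process can be illustrated with a small sketch. This is not the SAS Embedded Process API; the node names, sample values and helper functions below are hypothetical. Each data node reduces its own partition to a small partial result, and only those partials travel to the coordinator, rather than the raw rows.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Small summary a node sends back instead of its raw rows."""
    total: float
    count: int


# Hypothetical partitions held locally by three data nodes.
PARTITIONS = {
    "data-node-1": [120.0, 80.0, 100.0],
    "data-node-2": [60.0, 140.0],
    "data-node-3": [100.0],
}


def local_aggregate(rows):
    """Runs where the data lives; returns a summary, not the rows."""
    return Partial(total=sum(rows), count=len(rows))


def combine(partials):
    """Runs on the coordinator; merges the summaries into a global mean."""
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    return total / count


partials = [local_aggregate(rows) for rows in PARTITIONS.values()]
global_mean = combine(partials)
print(global_mean)
```

The design point is that the data moved between nodes is proportional to the number of nodes, not to the number of rows stored on them.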
Lessons learned
1 Make use of Lab environment
2 Enable Self-Service
3 Agile IT-Project management
4 Automation
5 Security
6 YARN queue management
Agenda
1 Data Analytics Framework
2 Technology
3 Method Example: AI
4 Case Study: Cross Selling
Artificial Intelligence (AI) is coming
1st Machine Age: Automation of physical tasks
2nd Machine Age: Automation of cognitive tasks
Images: shutterstock.com (used under license); dpa Picture Alliance
AI Evolution
1997: IBM's Deep Blue defeats the world chess champion. Purely rule based.
2011: IBM's Watson AI system wins a Jeopardy match against human players. Mixed machine learning and rule based.
2016: Google's DeepMind defeats top-ranked Go player (Lee Se-dol). Purely machine learning based.
Progression: Rule based, Hybrid, Learning based
Images: dpa Picture Alliance / Stan_Honda; dpa Picture Alliance / Seth Wenig; dpa Picture Alliance / Lee Jin-man
Insurance specific AI
Munich Re as industry leader in insurance-specific AI
Insurance-specific AI
General AI: Google, Facebook, Microsoft, OpenAI