Cloudera Hadoop & Industrie 4.0: Where Does the Data Stream Go?
Bernard Doering, Regional Sales Director, Central Europe
Cloudera Hadoop
©2014 Cloudera, Inc. All rights reserved.
Scalable
Flexible
Open
Cost-Effective
Hadoop vs Relational Databases
“Schema-on-Write” (relational databases)
§ Schema must be created before any data can be loaded
§ Reads are fast
§ Standards and governance
“Schema-on-Read” (Hadoop)
§ Data is simply copied to the file store; no transformation is needed
§ Loads are fast
§ Flexibility and agility
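The contrast between the two loading models can be sketched in a few lines of Python. This is a minimal illustration rather than anything from the deck: the sensor events, the `readings` table, and both function names are hypothetical, and an in-memory SQLite database stands in for a relational system while a plain list of raw lines stands in for a file store such as HDFS.

```python
import json
import sqlite3

# Hypothetical sensor events; the second carries a field no schema anticipated.
RAW_EVENTS = [
    '{"device": "press-01", "temp_c": 71.5}',
    '{"device": "press-02", "temp_c": 68.0, "vibration": 0.3}',
]


def schema_on_write(events):
    """Create the table first, then load; fields outside the schema are dropped."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (device TEXT NOT NULL, temp_c REAL NOT NULL)")
    for line in events:
        record = json.loads(line)  # parsing and validation happen at load time
        conn.execute(
            "INSERT INTO readings (device, temp_c) VALUES (?, ?)",
            (record["device"], record["temp_c"]),
        )
    rows = conn.execute(
        "SELECT device, temp_c FROM readings ORDER BY device"
    ).fetchall()
    conn.close()
    return rows


def schema_on_read(events, fields):
    """Copy raw lines untouched; pick the fields only when the data is read."""
    store = list(events)  # the "load" is a plain copy with no parsing
    return [tuple(json.loads(line).get(f) for f in fields) for line in store]


if __name__ == "__main__":
    print(schema_on_write(RAW_EVENTS))
    print(schema_on_read(RAW_EVENTS, ["device", "vibration"]))
```

In the first function the `vibration` value is lost because the table was defined before the data arrived; in the second it is still available, at the price of parsing every record on each read.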
Cloudera Company Snapshot
Founded: 2008, by former employees of [company logos not preserved]
Funding: >$1B invested in opportunity, ~$670M primary
Employees: ~740 today
World-Class Support: more than 70 24x7 global product support staff; proactive & predictive support programs using our EDH
Mission-Critical: production deployments in run-the-business applications worldwide (Financial Services, Retail, Telecom, Media, Health Care, Energy, Government, Manufacturing)
The Largest Ecosystem: more than 1,000 partners
Cloudera University: over 40,000 IT engineers trained
Open Source Leaders: Cloudera employees are leading developers & contributors to the complete Apache Hadoop ecosystem of projects.
Leading the Way in Data Management Powered by Hadoop
2008: Cloudera founded by Mike Olson, Amr Awadallah & Jeff Hammerbacher
2009: Hadoop creator Doug Cutting joins Cloudera
2009: Cloudera releases CDH, the first commercial Apache Hadoop distribution
2010: Cloudera Manager, the first management application for Hadoop
2011: Cloudera reaches 100 production customers
2011: Cloudera University expands to 140 countries
2012: Cloudera Enterprise 4, the standard for Hadoop in the enterprise
2012: Cloudera Connect reaches 300 partners
2013: Cloudera Impala, Cloudera Navigator, Cloudera Search
2013: Tom Reilly joins as CEO
Over 800 partners in Cloudera Connect
2014: The Enterprise Data Hub launched
Products shown: CDH, Cloudera Manager, Cloudera Enterprise 4 (“Ask Bigger Questions”), Enterprise Data Hub
Hadoop and the Enterprise Data Hub
An Open-Source Data Engine at the Core and Built for the Modern Enterprise
Key Attributes
Ø Secure, Governed, and Compliant
Ø Unified and Managed
Ø Open Architecture and Scalable
Ø Open-Source and Cost-Effective
Cloudera’s Enterprise Data Hub (architecture diagram)
3rd-party apps
• Batch processing: MapReduce
• Analytic SQL: Impala
• Search engine: Solr
• Machine learning: Spark
• Stream processing: Spark Streaming
Workload management: YARN
Storage for any type of data (unified, elastic, resilient, secure)
• Filesystem: HDFS
• Online NoSQL: HBase
Data management: Cloudera Navigator
System management: Cloudera Manager
Sentry (secure)
Intel Confidential
Big Deal: Cloudera + Intel
Intel invests $740M in Cloudera
§ This is Intel’s largest data center venture capital investment and represents Intel’s commitment to the Internet of Things and Big Data
§ Supports Cloudera’s ability to remain independent
Intel & Cloudera drive innovation through open source
§ Accelerate evolution of Hadoop by joining forces on foundational technologies
§ Enable open source developers to innovate in and on top of the Hadoop platform
Intel enables CDH to run best on Intel Architecture (performance optimisation)
§ Enables Cloudera to make best use of Intel data center technologies
§ Provides datacenter infrastructure for Cloudera development & benchmarking at scale
Big Goal: Converge on one open source platform
• Most stable, compatible, and mature Hadoop distribution
• Leading SQL functionality & performance (Impala)
• Deepest management and governance capabilities
• 150 Hadoop developers
• 100 open source committers
• The only distribution with performance and security enhanced from the silicon up
• Leading security capabilities including encryption, access control, and auditing
• 50 Hadoop developers and 12 committers
• Long-standing commitment to open source, with 1,000 developers working on Linux, KVM, Xen, Java, OpenStack, and Hadoop
Data Drives Innovation – Internet of Things
Diagram: Intelligent Cloud, Smart Clients, Intelligent Things
Flows: richer data to analyze; richer user experiences; richer data from devices
2.8 zettabytes of data generated worldwide in 2012 (1)
40 zettabytes of data will be generated worldwide in 2020 (1)
Sources: (1) IDC Digital Universe 2020, (2) IDC
Big Data is All Data and All Paradigms
Transactional & Application Data
• Volume
• Structured
• Throughput
Machine Data
• Velocity
• Semi-structured
• Ingestion
Social Data
• Variety
• Highly unstructured
• Veracity
Enterprise Content
• Variety
• Highly unstructured
• Volume
Expanding Data Requires A New Approach
1980s: Bring Data to Compute
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
Now: Bring Compute to Data
Information-centric businesses use all data: multi-structured, internal & external data of all types
(Diagram: relative size & complexity of Data versus Compute)
The Old Way: Moving Data to Compute
Huge Investment in Specialized Systems that Treat Data as a Commodity
SERVERS MARTS EDWS DOCUMENTS STORAGE SEARCH ARCHIVE
ERP, CRM, RDBMS, MACHINES FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS EXTERNAL DATA SOURCES
Major Challenges
Missing Data
• Leaving data behind
• Risk and compliance
• High cost of storage
Complex Architecture
• Many special-purpose systems
• Moving data around
• No complete views
Cost of Analytics
• Existing systems strained
• No agility
• “BI backlog”
Time to Data
• Up-front modeling
• Transforms slow
• Transforms lose data
The Old Way: Siloed Business Functions
Lack of Coordination Increases Opportunity Costs and Decreases Data Availability
TRANSACTIONAL RISK MARKETING LENDING CREDIT CARDS INVESTMENT
CUSTOMER DATA TRANSACTIONS MARKET DATA RESEARCH LOGS
BACK OFFICE
Major Challenges
Ø Poor Visibility
Ø Inefficiency
Ø Extreme Cost
Ø Complexity
The New Way: Bringing Compute to Data
Maximize Benefit from All Your Data for Mission-Critical Jobs and Innovation
SERVERS MARTS EDWS DOCUMENTS STORAGE SEARCH ARCHIVE
ERP, CRM, RDBMS, MACHINES FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS EXTERNAL DATA SOURCES
Major Benefits
Active Compliance Archive
• Full-fidelity original data
• Indefinite time, any source
• Lowest-cost storage
Diverse Analytic Platform
• Bring applications to data
• Combine different workloads on common data (e.g. SQL + Search)
• True analytic agility
Self-Service Exploratory BI
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests
Persistent Storage
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper
The New Way: Bring Business Functions to Data
Consolidate Relevant Services and Data in Multi-tenant Environment
MARKETING BACK OFFICE LOGS RESEARCH
TRANSACTIONS MARKET
INVESTMENT TRANSACTIONAL LENDING CREDIT CARDS
RISK CUSTOMER
360° VIEW
Major Benefits
Ø Compliant
Ø Centralized
Ø Self-Service
Ø Multiple Workloads
The Modern Information Architecture
Users: Data Architects, System Operators, Engineers, Data Scientists, Analysts, Business Users, Customers & End Users
Data sources: sys logs, web logs, files, RDBMS
Diagram components:
• Web/mobile applications
• Online serving system
• Enterprise data warehouse
• Enterprise reporting
• BI / analytics
• Machine learning
• Converged applications
• Cloudera Manager
• Metadata / ETL tools
• Enterprise Data Hub
Insurance
Use Case: 360° View
Differentiate coverage options by customizing plans based on information collected about customers’ lifestyle, health patterns, habits, and preferences.
Problem: Can’t Scale for Sensor Data
Current systems cannot integrate telemetric and sensor data delivered in real time with historical data to tailor policies and incentive plans to the user.
Solution: Stream Processing
Spark Streaming is used to calculate pricing occasions in real time based on live, unstructured data-in-motion from sensors, mobile devices, nanotechnology, etc.
Partners: [logos not preserved]
Auto Manufacturer: Streamlining Drivers’ Customer Experience
Challenge
• Each vehicle consists of thousands or millions of components, many streaming machine data
• Want to build loyalty by minimizing maintenance issues
Solution
• Improved customer loyalty through proactive care
• Cloudera correlates manufacturing data with customer information
• Predictive analytics & machine learning enable dynamic customer profiles & personalization
Manufacturing – IoT Trends
Connected Car and Smart Meter Grids
Value-added Services & Apps:
• Customer micro-segmentation and loyalty
• Alerts
• Proactive maintenance
• Quality improvement
• Operator services
• Performance optimisation, e.g. fuel or power consumption
Customer Success Across Industries
• Financial & Business Services
• Telecom & Technology
• Healthcare & Life Sciences
• Media & Information
• Retail & Consumer
• Energy & Public Sector
Enabling The App Store of Big Data
• BI and Analytics Partners
• SI, Cloud, MSP Partners
• Database Partners
• Resellers
• Data Integration Partners
• Hardware Partners