Cloudera Hadoop & Industrie 4.0: Where to Put the Data Stream? Bernard Doering, Regional Sales Director, Central Europe


1  

Cloudera Hadoop & Industrie 4.0: Where to Put the Data Stream?
Bernard Doering, Regional Sales Director, Central Europe

2  

Cloudera Hadoop

©2014 Cloudera, Inc. All rights reserved.

• Scalable
• Flexible
• Open
• Cost-Effective

3  

Hadoop vs Relational Databases

“Schema-on-Write” (relational databases)
§ Schema must be created before any data can be loaded
§ Reads are fast
§ Standards and governance

“Schema-on-Read” (Hadoop)
§ Data is simply copied to the file store; no transformation is needed
§ Loads are fast
§ Flexibility and agility
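The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: `sqlite3` stands in for a schema-on-write relational store, and a list of raw JSON lines stands in for files copied unchanged to a file store. The table layout, field names, and sample records are assumptions made for the example, not content from the slides.

```python
import json
import sqlite3

# Hypothetical sensor records; the second carries a field the table never planned for.
RAW_LINES = [
    '{"device": "pump-1", "temp_c": 71.5}',
    '{"device": "pump-2", "temp_c": 64.0, "vibration": 0.8}',
]


def load_schema_on_write(lines):
    """Schema first: the table must exist before loading; unplanned fields are dropped."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (device TEXT NOT NULL, temp_c REAL NOT NULL)")
    for line in lines:
        rec = json.loads(line)
        conn.execute("INSERT INTO readings VALUES (?, ?)", (rec["device"], rec["temp_c"]))
    return conn


def read_with_schema(lines, fields):
    """Schema on read: raw lines stay untouched; structure is applied per query."""
    for line in lines:
        rec = json.loads(line)
        yield tuple(rec.get(field) for field in fields)


conn = load_schema_on_write(RAW_LINES)
written = conn.execute("SELECT device, temp_c FROM readings ORDER BY device").fetchall()

# A question nobody anticipated at load time can still be answered from the raw data.
read_later = list(read_with_schema(RAW_LINES, ["device", "vibration"]))
```

In the schema-on-write path the `vibration` value is lost at load time, while the schema-on-read path can still project it later because the raw records were kept as they arrived.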

4  

Cloudera Company Snapshot

Founded: 2008, by former employees of [company logos not preserved]
Funding: >$1B invested in opportunity, ~$670M primary
Employees today: ~740
World-class support: more than 70 staff providing 24x7 global product support; proactive and predictive support programs using our EDH
Mission critical: production deployments in run-the-business applications worldwide (Financial Services, Retail, Telecom, Media, Health Care, Energy, Government, Manufacturing)
The largest ecosystem: more than 1,000 partners
Cloudera University: over 40,000 IT engineers trained
Open source leaders: Cloudera employees are leading developers and contributors to the complete Apache Hadoop ecosystem of projects

5

Leading the Way in Data Management Powered by Hadoop

2008: Cloudera founded by Mike Olson, Amr Awadallah & Jeff Hammerbacher
2009: Hadoop creator Doug Cutting joins Cloudera
2009: Cloudera releases CDH, the first commercial Apache Hadoop distribution
2010: Cloudera Manager, the first management application for Hadoop
2011: Cloudera reaches 100 production customers
2011: Cloudera University expands to 140 countries
2012: Cloudera Enterprise 4, the standard for Hadoop in the enterprise
2012: Cloudera Connect reaches 300 partners
2013: Cloudera Impala, Cloudera Navigator, Cloudera Search
2013: Tom Reilly joins as CEO
2014: The Enterprise Data Hub launched
Over 800 partners in Cloudera Connect

[Timeline graphics: CDH, Cloudera Manager, Cloudera Enterprise 4, Enterprise Data Hub. Tagline: Ask Bigger Questions]

6  

Hadoop and the Enterprise Data Hub
An Open-Source Data Engine at the Core and Built for the Modern Enterprise

Key Attributes
Ø Secure, Governed, and Compliant
Ø Unified and Managed
Ø Open Architecture and Scalable
Ø Open-Source and Cost-Effective

Cloudera’s Enterprise Data Hub (architecture diagram)
• 3rd-party apps
• Batch processing: MapReduce
• Analytic SQL: Impala
• Search engine: Solr
• Machine learning: Spark
• Stream processing: Spark Streaming
• Workload management: YARN
• Storage for any type of data (unified, elastic, resilient, secure)
• Filesystem: HDFS
• Online NoSQL: HBase
• Data management: Cloudera Navigator
• System management: Cloudera Manager
• Secure: Sentry

7  

Cloudera & The Intel Alliance

8

Intel Confidential

Big Deal: Cloudera + Intel

Intel invests $740M in Cloudera
§ Intel’s largest data center venture capital investment, representing Intel’s commitment to the Internet of Things and Big Data
§ Supports Cloudera’s ability to remain independent

Intel & Cloudera drive innovation through open source
§ Accelerate evolution of Hadoop by joining forces on foundational technologies
§ Enable open source developers to innovate in and on top of the Hadoop platform

Intel enables CDH to run best on Intel Architecture (performance optimisation)
§ Enables Cloudera to make best use of Intel data center technologies
§ Provides datacenter infrastructure for Cloudera development & benchmarking at scale

9

Big Goal: Converge on one open source platform

Cloudera
• Most stable, compatible, and mature Hadoop distribution
• Leading SQL functionality & performance (Impala)
• Deepest management and governance capabilities
• 150 Hadoop developers
• 100 open source committers

Intel
• The only distribution with performance and security enhanced from the silicon up
• Leading security capabilities including encryption, access control, and auditing
• 50 Hadoop developers and 12 committers
• Long-standing commitment to open source with 1000 developers working on Linux, KVM, Xen, Java, OpenStack, Hadoop

Cloudera for Big Data

11  

Data drives innovation: Internet of Things

[Diagram: Intelligent Things, Smart Clients, Intelligent Cloud; flows labeled “Richer data from devices”, “Richer data to analyze”, “Richer user experiences”]

2.8 zettabytes of data generated worldwide in 2012 (1)
40 zettabytes of data will be generated worldwide in 2020 (1)

Sources: (1) IDC Digital Universe 2020, (2) IDC
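The two IDC figures quoted on this slide (2.8 zettabytes in 2012, 40 zettabytes in 2020) imply a compound growth rate, which the short calculation below works out. The function name is illustrative, and the result assumes steady year-over-year growth between the two data points, which the slide itself does not claim.

```python
def implied_annual_growth(start_volume, end_volume, years):
    """Compound annual growth rate that turns start_volume into end_volume."""
    return (end_volume / start_volume) ** (1 / years) - 1


# Figures quoted on the slide (IDC): 2.8 ZB in 2012, 40 ZB in 2020.
growth = implied_annual_growth(2.8, 40.0, 2020 - 2012)
print(f"Implied growth: {growth:.1%} per year")
```

Under that assumption the data volume grows by roughly 39% a year, which means it more than doubles about every two to three years.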

12  

Big Data is All Data and All Paradigms

Transactional & Application Data
• Volume
• Structured
• Throughput

Machine Data
• Velocity
• Semi-structured
• Ingestion

Social Data
• Variety
• Highly unstructured
• Veracity

Enterprise Content
• Variety
• Highly unstructured
• Volume

13  

Expanding Data Requires A New Approach

1980s: Bring Data to Compute
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only

Now: Bring Compute to Data
Information-centric businesses use all data: multi-structured, internal & external data of all types

[Diagram: relative size & complexity of Data and Compute in each era]

14

The Old Way: Moving Data to Compute
Huge Investment in Specialized Systems that Treat Data as a Commodity

SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES

Major Challenges

Missing Data
• Leaving data behind
• Risk and compliance
• High cost of storage

Complex Architecture
• Many special-purpose systems
• Moving data around
• No complete views

Cost of Analytics
• Existing systems strained
• No agility
• “BI backlog”

Time to Data
• Up-front modeling
• Transforms slow
• Transforms lose data

15

The Old Way: Siloed Business Functions
Lack of Coordination Increases Opportunity Costs and Decreases Data Availability

TRANSACTIONAL, RISK, MARKETING, LENDING, CREDIT CARDS, INVESTMENT
CUSTOMER DATA, TRANSACTIONS, MARKET DATA, RESEARCH, LOGS
BACK OFFICE

Major Challenges
Ø Poor Visibility
Ø Inefficiency
Ø Extreme Cost
Ø Complexity

   

16  

The New Way: Bringing Compute to Data
Maximize Benefit from All Your Data for Mission-Critical Jobs and Innovation

SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES

Major Benefits

Active Compliance Archive
• Full fidelity original data
• Indefinite time, any source
• Lowest cost storage

Diverse Analytic Platform
• Bring applications to data
• Combine different workloads on common data (e.g. SQL + Search)
• True analytic agility

Self-Service Exploratory BI
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests

Persistent Storage
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper

17

The New Way: Bring Business Functions to Data
Consolidate Relevant Services and Data in Multi-tenant Environment

MARKETING, BACK OFFICE, LOGS, RESEARCH, TRANSACTIONS, MARKET, INVESTMENT, TRANSACTIONAL, LENDING, CREDIT CARDS, RISK, CUSTOMER

360° VIEW

Major Benefits
Ø Compliant
Ø Centralized
Ø Self-Service
Ø Multiple Workloads

 

   

18  

The Modern Information Architecture

Data Architects, System Operators, Engineers, Data Scientists, Analysts, Business Users, Customers & End Users

WEB/MOBILE APPLICATIONS, ONLINE SERVING SYSTEM, ENTERPRISE DATA WAREHOUSE, ENTERPRISE REPORTING, BI / ANALYTICS, MACHINE LEARNING, CONVERGED APPLICATIONS

ENTERPRISE DATA HUB, CLOUDERA MANAGER, META DATA / ETL TOOLS

SYS LOGS, WEB LOGS, FILES, RDBMS

19  

Sample Use Cases

20

Insurance

Use Case: 360° View
Differentiate coverage options by customizing plans based on information collected about customers’ lifestyle, health patterns, habits, and preferences.

Problem: Can’t Scale for Sensor Data
Current systems cannot integrate telemetric and sensor data delivered in real time with historical data to tailor policies and incentive plans to the user.

Solution: Stream Processing
Spark Streaming is used to calculate pricing occasions in real time based on live, unstructured data-in-motion from sensors, mobile devices, nanotechnology, etc.

Partners: [partner logos not preserved]
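The idea behind the insurance use case, repricing as sensor events arrive rather than in a nightly batch, can be sketched in plain Python. This is not Spark Streaming code: it only shows a per-driver sliding-window aggregation of the kind a streaming job would run. The event shape, the window size, the base premium, and the 5% surcharge rule are all hypothetical values chosen for the example.

```python
from collections import defaultdict, deque

WINDOW = 3            # assumed: number of most recent events kept per driver
BASE_PREMIUM = 100.0  # assumed: hypothetical base price


def price_stream(events, window=WINDOW):
    """Yield (driver, premium) after each event, using a per-driver sliding window."""
    recent = defaultdict(lambda: deque(maxlen=window))
    for driver, harsh_brakes in events:
        recent[driver].append(harsh_brakes)
        average = sum(recent[driver]) / len(recent[driver])
        # Assumed pricing rule: 5% surcharge per average harsh-braking event.
        yield driver, round(BASE_PREMIUM * (1 + 0.05 * average), 2)


# Hypothetical telematics events: (driver id, harsh-braking count in the interval).
events = [("d1", 0), ("d1", 2), ("d2", 1), ("d1", 4), ("d1", 0)]
quotes = list(price_stream(events))
```

Because `price_stream` is a generator, each quote is produced as soon as its event is consumed, and the bounded `deque` lets old readings age out of the price automatically.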

21  

22  

Auto Manufacturer: Streamlining drivers’ customer experience

Challenge
• Each vehicle comprises thousands or millions of components, many streaming machine data
• Want to build loyalty by minimizing maintenance issues

Solution
• Improved customer loyalty through proactive care
• Cloudera correlates manufacturing data with customer information
• Predictive analytics & machine learning enable dynamic customer profiles & personalization

23  

Manufacturing: IoT Trends

Connected Car and Smart Meter Grids

Value-added Services & Apps:
• Customer micro-segmentation and loyalty
• Alerts
• Proactive maintenance
• Quality improvement
• Operator services
• Performance optimisation, e.g. fuel or power consumption

24  

Customer Success Across Industries
• Financial & Business Services
• Telecom & Technology
• Healthcare & Life Sciences
• Media & Information
• Retail & Consumer
• Energy & Public Sector

25  

Enabling The App Store of Big Data

• BI and Analytics Partners
• SI, Cloud, MSP Partners
• Database Partners
• Resellers
• Data Integration Partners
• Hardware Partners

26  

Thank You!
Bernard Doering
[email protected]
Tel. +49 172 692 9837