The Future of Data Management with Hadoop and the Enterprise Data Hub
Amr Awadallah (@awadallah) | Cofounder and CTO

The Future of Data Management - MicroStrategy...with Hadoop and the Enterprise Data Hub ©2014 Cloudera, Inc. All rights ... ENTERPRISE DATA WAREHOUSE ENTERPRISE BI / ANALYTICS REPORTING



©2014 Cloudera, Inc. All rights reserved.

Cloudera Snapshot

Founded 2008, by former employees of

Employees Today: ~800
World Class Support: 24x7 global staff; proactive & predictive support programs
Mission Critical: Thousands of enterprise users; 500+ paying subscription customers
The Largest Ecosystem: 1,200+ partners
Cloudera University: 100,000+ trained
Open Source Leaders: Cloudera employees are leading developers & contributors
Total Capital Raised: $1B+ (from Intel, Google, Dell, T. Rowe Price, Accel, Greylock)
Mission: Help organizations leverage the power of all their data to ask bigger questions.


Why is Big Data Happening Now?

• Instrumentation: Everything that can be measured will be measured.
• Consumerization: Employees and customers expect more personal interactions, but not at the cost of their privacy.
• Experimentation: The most innovative companies embrace experimentation and agility.


Big Data is Only Getting Bigger

• 1.8 trillion gigabytes of data was created in 2011*
• More than 90% is unstructured data
• Data volume doubles every year

[Chart: GB of data (in billions), axis from 0 to 10,000, for 2005, 2010, and 2015; series: structured data and unstructured data]

* Source: IDC 2011
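
As a rough illustration of what yearly doubling implies, the sketch below projects volume forward from the slide's 2011 figure. It treats 1.8 trillion GB as 1.8 ZB (decimal units) and takes the one-year doubling period from the slide as given. The function name and the chosen years are illustrative assumptions, and the output is arithmetic, not a forecast.

```python
def projected_volume_zb(base_zb: float, years_elapsed: int,
                        doubling_period_years: float = 1.0) -> float:
    """Compound growth: the volume doubles once per doubling period."""
    return base_zb * 2 ** (years_elapsed / doubling_period_years)


BASE_2011_ZB = 1.8  # 1.8 trillion GB, the slide's 2011 figure (IDC 2011)

# Illustrative years only; the doubling period is the slide's claim.
for year in (2011, 2012, 2013, 2014):
    volume = projected_volume_zb(BASE_2011_ZB, year - 2011)
    print(f"{year}: {volume:.1f} ZB")
```

Changing `doubling_period_years` shows how sensitive the projection is to that single assumption.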


And It Isn’t Just About Web 2.0 / Social

• MEDIA / ENTERTAINMENT: Viewers / advertising effectiveness
• ON-LINE SERVICES / SOCIAL MEDIA: People & career matching; website optimization
• HEALTH CARE: Patient sensors, monitoring, EHRs; quality of care
• FINANCIAL SERVICES: Risk & portfolio analysis; new products
• CONSUMER PACKAGED GOODS: Sentiment analysis of what’s hot, customer service
• TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows; customer sentiment
• RETAIL: Consumer sentiment; optimized marketing
• EDUCATION & RESEARCH: Experiment sensor analysis
• LIFE SCIENCES: Clinical trials; genomics
• AUTOMOTIVE: Auto sensors reporting location, problems
• COMMUNICATIONS: Location-based advertising
• HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg quality; warranty analysis
• UTILITIES: Smart meter analysis for network capacity
• OIL & GAS: Drilling exploration sensor analysis
• LAW ENFORCEMENT & DEFENSE: Threat analysis, social media monitoring, photo analysis


Expanding Data Requires A New Approach

What we do: Copy Data to Applications

Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
• Multiple copies of data

What we should do: Bring Applications to Data

Information-centric businesses use all data: multi-structured, internal & external data of all types.

[Diagram: on one side, each App paired with its own copy of Data; on the other, several Apps sharing one body of Data]
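
One way to picture "bring applications to data" is a computation that runs against each partition where it is stored and ships back only a small summary. The sketch below is a plain-Python stand-in, not Hadoop code: the partitions, log lines, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical partitions: each inner list stands in for log lines
# stored on one node of a cluster.
PARTITIONS = [
    ["GET /home", "GET /cart", "GET /home"],
    ["GET /home", "POST /checkout"],
    ["GET /cart"],
]


def local_count(lines):
    """Runs 'where the data lives': reduce one partition to a small summary."""
    return Counter(line.split()[1] for line in lines)


def combine(partials):
    """Merge the per-partition summaries; only these small objects move."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


page_hits = combine(local_count(partition) for partition in PARTITIONS)
print(dict(page_hits))
```

The raw lines never leave their partition. Only the per-partition counters are combined, which is the contrast with copying full data sets into each application.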


Hadoop Changes the Game: Storage & Compute Together

The Old Way: $30,000+ per TB. Expensive & Unattainable
• Hard to scale
• Network is a bottleneck
• Only handles relational data
• Difficult to add new fields & data types
Expensive, special-purpose, “reliable” servers; expensive licensed software
[Diagram: Compute (RDBMS, EDW) and Data Storage (SAN, NAS), with the Network between them]

The Hadoop Way: $300-$1,000 per TB. Affordable & Attainable
• Scales out forever
• No bottlenecks
• Easy to ingest any data
• Agile data access
Commodity “unreliable” servers; hybrid open source software
[Diagram: Compute (CPU), Memory, and Storage (Disk) together]
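
To make the per-terabyte figures concrete, the sketch below multiplies them out for a single capacity. The rates come from the slide. The 500 TB capacity and the function name are hypothetical, and "$30,000+" is treated as a lower bound.

```python
OLD_WAY_USD_PER_TB = 30_000       # slide: "$30,000+ per TB" (lower bound)
HADOOP_USD_PER_TB = (300, 1_000)  # slide: "$300-$1,000 per TB"


def storage_cost_usd(capacity_tb: int, usd_per_tb: int) -> int:
    """Cost of a given capacity at a flat per-TB rate."""
    return capacity_tb * usd_per_tb


capacity_tb = 500  # hypothetical capacity, chosen only for illustration

old_cost = storage_cost_usd(capacity_tb, OLD_WAY_USD_PER_TB)
hadoop_low, hadoop_high = (
    storage_cost_usd(capacity_tb, rate) for rate in HADOOP_USD_PER_TB
)

print(f"Old way:    at least ${old_cost:,}")
print(f"Hadoop way: ${hadoop_low:,} to ${hadoop_high:,}")
print(f"Ratio:      {old_cost // hadoop_high}x to {old_cost // hadoop_low}x")
```

The ratio does not depend on the capacity chosen; it follows directly from the slide's per-TB rates.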


The Old Way: Bringing Data to Applications

1. Can’t Retain Valuable Data
• Leaving data behind
• Risk and compliance
• High cost of storage

2. Can’t Meet ETL SLAs
• Up-front modeling
• Transforms slow
• Transforms lose data

3. Can’t Ask New Questions
• Existing systems strained
• No agility
• “BI backlog”

4. Can’t Get a 360 View
• Many special-purpose systems
• Moving data around
• No complete views

[Diagram: data sources (ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources) and separate systems (servers, marts, EDWs, documents, storage, search, archive)]


The New Way: Bringing Applications to Data

1. Active Archive
• Full fidelity original data
• Indefinite time, any source
• Lowest cost storage

2. Scalable Transformations
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper

3. Agile Exploration
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests

4. Consolidated Architecture
• Bring applications to data
• Combine different workloads on common data (e.g., SQL + Search)
• True analytic agility

[Diagram: data sources (ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources) and systems (servers, marts, EDWs, documents, storage, search, archive)]
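
"Schema on read" can be sketched in a few lines: raw records are stored exactly as they arrive, and each reader applies the fields and types it needs at query time. The example below is a minimal plain-Python illustration using JSON lines. The records, field names, and `read_with_schema` helper are hypothetical and are not part of any Hadoop API.

```python
import json

# Raw events kept in full fidelity; fields and types vary per record.
RAW_EVENTS = [
    '{"user": "a", "ms": 120, "page": "/home"}',
    '{"user": "b", "ms": "95"}',
    '{"user": "c", "page": "/cart", "referrer": "ad"}',
]


def read_with_schema(lines, schema):
    """Apply a schema (field name -> cast function) while reading."""
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            # Missing fields become None instead of failing the load.
            row[field] = cast(value) if value is not None else None
        yield row


# Two readers ask different questions of the same raw data.
latency_view = list(read_with_schema(RAW_EVENTS, {"user": str, "ms": int}))
page_view = list(read_with_schema(RAW_EVENTS, {"user": str, "page": str}))
print(latency_view)
print(page_view)
```

No up-front modeling was needed to store the events, and a new question only requires a new schema, not a reload.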


Core Benefits of the Enterprise Data Hub

• Full-Fidelity Active Archive

• Accelerate Time to Insight (Scale)

• Unlock Agility and Exploration

• Consolidate Silos for 360° View

• Enable Pervasive Analytics


Cloudera Enterprise powered by Apache Hadoop

A new kind of data platform:
• One place for unlimited data
• Unified, multi-framework data access

Key Advantages:
• Leading performance
• Enterprise system and data management
• Fundamentally secure
• Open source, open standards

[Diagram: Process, Discover, Model, Serve; Unlimited Storage; Security and Administration]

Deployment Flexibility: On-Premises, Appliances, Engineered Systems, Public Cloud, Private Cloud, Hybrid Cloud


One Platform, Many Workloads

Batch, Interactive, and Real-Time. Leading performance and usability in one platform.

• End-to-end analytic workflows

• Access more data

• Work with data in new ways

• Enable new users

Process
• Ingest: Sqoop, Flume, Kafka
• Transform: MapReduce, Hive, Pig, Spark

Discover
• Analytic Database: Impala
• Search: Solr

Model
• Machine Learning: SAS, R, Spark, Mahout, Oryx

Serve
• NoSQL Database: HBase
• Streaming: Spark Streaming

Unlimited Storage: HDFS, HBase

Security and Administration: YARN, Cloudera Manager, Cloudera Navigator


Complement Existing Investments and Skills

BI Integration
• Integrate seamlessly with data analytics vendors
• Push heavy workloads down to Cloudera using analytic SQL capabilities

MicroStrategy components shown: Desktop, Web, Mobile, Intelligence Server

[Diagram: Process, Discover, Model, Serve; Unlimited Storage; Security and Administration]

Deployment Flexibility: On-Premises, Appliances, Engineered Systems, Public Cloud, Private Cloud, Hybrid Cloud
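
The idea of pushing heavy workloads down can be sketched by comparing two ways of computing the same totals: pulling every row to the client and aggregating there, or sending the aggregation to the engine as SQL and receiving only the result. The example uses Python's built-in `sqlite3` purely as a stand-in engine. The table, columns, and data are hypothetical, and nothing here is MicroStrategy or Cloudera API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 5.0), ("west", 7.5), ("west", 2.5), ("west", 1.0)],
)

# Without pushdown: every row travels to the client, which does the work.
all_rows = conn.execute("SELECT region, amount FROM sales").fetchall()
client_totals = {}
for region, amount in all_rows:
    client_totals[region] = client_totals.get(region, 0.0) + amount

# With pushdown: the engine aggregates and returns one row per region.
pushed_rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
pushed_totals = dict(pushed_rows)

print(f"rows moved without pushdown: {len(all_rows)}")
print(f"rows moved with pushdown:    {len(pushed_rows)}")
```

Both paths produce the same totals. The difference is how much data crosses the wire, which grows with table size in the first case and with the number of groups in the second.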


The Modern Information Architecture

[Diagram components: sources (sys logs, web logs, files, RDBMS); Enterprise Data Hub; online serving system; enterprise data warehouse; metadata / ETL tools; Cloudera Manager; web/mobile applications; enterprise reporting; BI / analytics; machine learning; converged applications]

People shown: Data Architects, System Operators, Engineers, Data Scientists, Analysts, Business Users, Customers & End Users


Hadoop Administration Made Easy

Cloudera Manager: Focus on the solution, not the cluster, with the only complete, zero-downtime administration tool for Apache Hadoop.

Unique Capabilities:

• Unified configuration, management and monitoring across all services

• Online installation and upgrades

• Direct connection to Cloudera Support

• 3rd Party Extensibility


Big Data Meets Data Governance

Cloudera Navigator: Minimize risk and maintain compliance with the only native end-to-end data governance solution for Apache Hadoop.

Unique Capabilities:

• Auditing

• Lineage

• Metadata Tagging and Discovery

• Lifecycle Management


A High Level View of the Journey

[Diagram: stages labeled Cheap Storage, ETL Acceleration, EDW Optimization, Agile Exploration, Not Only SQL, and Pervasive Analytics; groupings labeled Operational Efficiency (Faster, Bigger, Cheaper) and Transformative Applications (New Business Value); ends labeled IT and Business]

Premier analyzes $41 billion in healthcare spend, driving recommendations that help providers get better products at lower costs.


Allstate Builds A Universal Data Archive

Allstate optimizes offers and pricing with a comprehensive view of individual risk.

The Challenge:
• Data silos spread across company with 80+ years historical data; only some digitized
• Analysis on one state’s data takes 24 hours; can’t analyze all 50 states at once

The Solution:
• Universal data archive on Cloudera Enterprise spans enterprise-wide systems
• 3 use cases: storage, ETL, applied math
• Analyze all 50 states in 16 hours using Hive; 500X speed-up

Thank you!


Why Cloudera?

• Enterprise-Grade Hadoop: Differentiated performance, security, management, and governance.
• Expertise: No one knows Hadoop better than Cloudera.
• Enablement: Support, Training, and Professional Services enable and deliver success.
• Ecosystem: Cloudera ensures that Hadoop works with the platforms, tools, and integrators you rely on.
• Sustainable Innovation: Our hybrid open source model delivers the benefits of open source and what the enterprise requires, while enabling us to invest in the future for our customers.