Hadoop Does Not Equal Big Data

DESCRIPTION

Leveraging research findings from EMA's 2012 "Big Data Comes of Age" Research Report, this new Infographic outlines the five business requirements driving Big Data solutions and the technologies that support those requirements.

Citation preview

Page 1: Hadoop Does Not Equal Big Data

HADOOP POSITIVES

HADOOP NEGATIVES

RESPONSE

LOAD

COMPLEX WORKLOAD

ECONOMICS

STRUCTURE

See EMA's knowledge in action. Read more at http://research.enterprisemanagement.com/bigData.html

EMA worldwide survey respondents said that schema flexibility was an issue with the following platforms:

Operational Platforms

Data Warehouse/Data Mart

Analytical Platforms/Appliances

Enterprise data is growing at exponential rates. A majority of these new data sources are creating vast amounts of new information.

(20%)

Hadoop's MapReduce processing engine is batch and not real-time

Hadoop is still a relatively young technology and still maturing

Hadoop is "free" like a "free puppy". Hadoop clusters require significant amounts of administration and training to operate

26%

Big Data solutions are driven by five BUSINESS REQUIREMENTS.

Online applications and mobile geo-location businesses are driven by speed of response. This includes:

Require faster processing of multi-structured data sets

Require faster reaction to streaming event systems

PETABYTE

2013

Overcoming obstacles of traditional systems due to processing power and data storage limitations.

35%

32%

36%

Needed “deep” visibility into operational transaction data like clickstream or point of sale

Required higher levels of advanced analytic processing

Needed to move from sample data sets to full dataset analysis

39% 32%

35% 43% 40%

Copyright 2013, EMA Inc. All Rights Reserved.

Operational Platforms

Data Warehouse / Data Mart

EMA’s Hybrid Data Ecosystem represents 8 types of platforms that can work together to address the business drivers powering Big Data solutions.

BIG DATA

Operational platforms have been optimized on the third normal form (3NF) structured schema. This approach is not well suited to variable data types.

Operational Platforms

Analytical Platforms

Data Warehouse/Data Mart

Hadoop's parallel processing engine provides the ability to perform large workloads

Hadoop scales to large capacity across multiple nodes

Over a quarter of EMA worldwide survey respondents are implementing NoSQL platforms like Hadoop

Hadoop and MapReduce are not well designed for online numerical analytics using SQL

Real-time operational response time

Stretching the boundaries of traditional systems and infrastructure

Right-time analytics on large datasets

EMA worldwide survey respondents said that speed of response is a primary driver of Big Data strategies

Nearly one fifth of respondents to a world wide EMA end-user survey indicated that their Big Data environments are between

The following are the top business challenges being addressed by organizations using complex processing:

The economics of technology is the great equalizer and often can contribute to an early majority adoption of a particular innovation. This has been especially true with Big Data.

Many companies have focused on return on investment (ROI) regarding Big Data adoption. Big Data platforms can leverage commodity hardware and often the software is open source, lowering the economic barriers to entry.

of Big Data solution architects say that legacy platforms are economically unable to meet Big Data challenges.

of IT project sponsors of Big Data need to lower total cost of ownership (TCO) of data management platforms.

HADOOP DOES NOT EQUAL BIG DATA

Hadoop is a great new technology, but not the only answer to Big Data questions

Architects find that high latency in processing is a hurdle to their implementation of Big Data solutions when using the following platforms

Big Data program sponsors indicated operational and capital cost issues associated with the following platforms:

50% 44% 52%

Highly developed data models and schemas in data warehouses and data marts make changes to data structure a long, difficult process to implement.

Analytical platforms have been optimized for numerical analytical queries on structured data. Using variable data formats such as pictures and documents is troublesome.

As an open source platform, Hadoop is economical to install

Require faster response time of operational or analytical data queries

Speed in data management processing creates competitive advantage

12-40TB

Operational Platforms

Data Warehouse/Data Mart

Analytical Platforms

47%

42%

36%

Organizations are faced with increased diversity of data structures. This includes relational structures and multi-structured JSON formats as well as documents, images and video files.

Enterprise Management Associates Proudly Presents

While Hadoop as a technology platform has opened the eyes of many to the world of Big Data, it is not the only option available to handle the future flood of multi-structured datasets and workloads coming from web-based applications, mobile devices, telematic sensor information and social applications. Big Data has found a home across a wide selection of technology platforms, including Hadoop. However, Big Data implementation strategies are not driven simply by technology....

40% 38%

41%

37%

33%

33%

Asset optimization for portfolio management, staff planning for human resources, logistical management for transportation

Fraud analysis for retail, liquidity risk assessment for financial services, risk mitigation for CFO.

Patient segmentation for healthcare; market basket analysis for retail; cross-sell/up-sell treatment for online and consumer products.

Customer churn prediction for business to consumer relationships, click analysis for online retailing, showroom behavior analysis for consumer product and retail.

Data Loads are growing not just in size, but in diversity and complexity. The power of Big Data platforms to persist a mixture of data creates an opportunity to address both analytic and operational scenarios. Without this data to fuel these workloads, it would be impossible to execute against the growing demands of enterprise applications and analytic environments.

EMA research respondents indicated that complex workloads and processing drove their business requirements for Big Data solutions and architectures.

Organizations implementing Big Data solutions said that the following platforms had issues with complex processing workloads:

The need for Big Data platforms to provide new speeds and scale of Response has opened the door for new ways to leverage data and provide insights to end users. This is especially true in the area of Big Data analytics where the ability to react in near real time is a key component to the value these platforms can deliver. Sub-second data delivery is not necessary for all applications and data driven scenarios, but it is clear that real-time use cases are growing in importance and becoming more critical to many companies. New Big Data technologies are at the core of this evolution, and powering new solutions and improved time to action.

Operational Platforms

Data Warehouse

Data Mart

Discovery Platform

NoSQL Platforms

Cloud-Based Platforms

Analytical Platforms

Hadoop

Requirements

Economics

Load

Structure

Response

Complex workload