emc-forum-india
1© Copyright 2011 EMC Corporation. All rights reserved.
Greenplum: Extracting Value from your Data
Peter Cooper, APJ Technology Team
IN THIS DECADE THE DIGITAL UNIVERSE
WILL GROW 44X
FROM 0.9 ZETTABYTES TO 35.2 ZETTABYTES
Source: 2010 IDC Digital Universe Study
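To put the 44X figure in yearly terms, here is a minimal sketch of the compound growth rate it implies. It assumes "this decade" means exactly 10 years of constant compounding, which is an assumption and not something the slide states; the function name is hypothetical.

```python
def implied_annual_growth(multiple: float, years: int) -> float:
    """Return the constant yearly growth rate that compounds to `multiple`."""
    return multiple ** (1.0 / years) - 1.0


# Assumption: 44X growth spread evenly over 10 years.
rate = implied_annual_growth(44.0, 10)
print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption the digital universe would grow by roughly 46% every year.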
Source: McKinsey Global Institute 2011
Big Data will improve business performance
Through 2015, organisations integrating high-value, diverse, new information types and sources into a coherent information management infrastructure will outperform their industry peers financially by more than 20%.*
* Gartner, Merv Adrian
Big Data is more than Size
• Volume: data volumes approaching multiple petabytes
• Velocity: data being generated and ingested for analysis in real-time
• Variety: tabular, documents, e-mail, metering, network, video, image, audio
• Complexity: different standards, domain rules, and storage formats per data type
[Figure: Volume, Velocity, Variety, and Complexity, illustrated with transactional data, documents, smart grid, images, audio, video, and text. Outcomes shown: new insights on customers, products, and operations; contextual and location-aware delivery to any device.]
Source: Gartner, March 2011
Increase Revenue With Greenplum
Big Data Analytics Increases Per Customer Profit For Retail Banking Firm
[Figure: customer profit (low to high) plotted as the data leveraged moves from traditional data to big data.
Platform stages shown: Legacy System; Greenplum Database Business Intelligence Reporting; Greenplum In-Database Analytics; Greenplum Big Data Analytics.
Capabilities shown: agent "best guess"; branch-level reporting enabling profit-based recommendations; market basket analysis and buyer associations enabling user-based recommendations; data enriched with unstructured activity logs to identify at-risk customers.]
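The slide names market basket analysis without describing it. As a rough illustration of the idea only, the sketch below computes pair support and rule confidence over a handful of baskets. The product names and baskets are invented for illustration, and this is plain Python, not Greenplum's in-database analytics.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets of banking products held per customer (illustrative only).
baskets = [
    {"savings", "credit_card"},
    {"savings", "credit_card", "mortgage"},
    {"savings", "mortgage"},
    {"credit_card"},
]


def pair_support(baskets):
    """Fraction of baskets that contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


support = pair_support(baskets)
print(support[("credit_card", "savings")])
print(confidence(baskets, "savings", "credit_card"))
```

A high-confidence pair such as "customers with savings accounts often also hold a credit card" is the kind of buyer association that could feed a user-based recommendation.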
Optimize Marketing Campaigns With Greenplum
Big Data Analytics Improves Customer Interactions For Credit Card Company
[Figure: likelihood of conversion (low to high) plotted as the data leveraged moves from traditional data to big data.
Platform stages shown: Legacy System; Greenplum In-Database Analytics; Greenplum Big Data Analytics.
Data and techniques shown: referring URL only; mapping clicks to users; user clustering; user-to-funnel conversion; off-website behaviour; Twitter sentiment; Facebook F-of-F optimization; blog and press sentiment; YouTube and podcasts.
Captions: clicks become users targeted to predicted outcomes; social media, blog and press, and competitor website behavior leveraged to refine predictions.]
Big Data Enabled Financial Services
“We are using Greenplum to process data into surveillance patterns, the analytics of issues that are happening of regulatory interest.”
MARTIN COLBURN, CTO
“The ability to process big amounts of information on a near realtime basis has greatly altered the value of data pulled off by NYSE Euronext in the U.S.”
STEVE HIRSCH, CHIEF DATA OFFICER
“Greenplum offers strong scalability advantages due to its highly parallel model that enables us to simply add more servers as data volumes expand.”
ANNA EWING, CIO
Greenplum Global Customers
• Telecom, Media & Entertainment: Analyze user behavior to eliminate network abuses
• Retail: Direct marketing/CRM
• Financial Services: Detect and prevent fraud; credit scoring and analysis to reduce credit risk
• Pharmaceutical: Analytics for drug discovery and development
• Internet: Clickstream analytics for ad targeting and market research
Greenplum APJ Customers
• Japan: ~1 PB database for CDR analysis and scenario testing
• India: Cellular network performance analysis and reporting (~40 TB database)
• Thailand: Enterprise data warehouse for customer analytics, replacing Teradata and Oracle systems (~70 TB)
• China: Alipay data warehouse (~1 PB)
• Australia: Internal audit of user activities
No Value Available
Today’s Reality for most Companies
• Shadow systems
• ‘Shallow’ business intelligence
• Static schemas accrete over time
• Non-standard, in-memory analytics
• Slow-moving models
• Slow-moving data
• Departmental warehouses
• Sources
Uncovering the value.
“Over the last 25 years, companies have been focused on leveraging maybe 5% of the information available to them… In order to compete well, companies are looking to dip into the rest of the 95% that can make them better than anyone else.”
Source: Forrester Research Inc.
Today’s Situation | Big Data Analytics Ramifications
Less than 10% of available enterprise data | Vast majority of available data, including external sources
“Rearview mirror” reports, dashboards, and analysis | “Forward looking” predictions with recommendations
Weeks, months, or even quarters old | Real-time or near real-time
Incomplete, “over-processed”, regulatory data | Raw, unstructured, “statistically correct” data
Architectures and methods that take 6 to 18 months to exploit | Vastly accelerated time to market
Clicker Question:
Do you think that your current environment is capable of extracting the full value of the data available?
• Yes
• No
Databases Need to Adapt to Big Data
The Data Warehouse Institute (TDWI)
• 50% of TDWI survey respondents will replace their DW platform in the next 3 years because:
• E
Source: TDWI Next Gen Database Study, 2010
[Chart: reasons given for replacing the DW platform, with two callouts: "Cannot do advanced analysis" and "Cannot handle big data volumes"]
• Poor query response: 45%
• Can’t support advanced analytics: 40%
• Inadequate data load speed: 39%
• Can’t scale up to large data volumes: 37%
• Cost of scaling up is too expensive: 33%
• Poorly suited to real-time or on-demand workloads: 29%
Big Data Analytics Reference Architecture
[Figure: a five-stage pipeline. Labels recoverable from the diagram:
• Data Input: data sources (Multimedia, Web/Social, ERP, CRM, POS, Mobile, Documents, Machine), LOB data
• Integration: Data Quality, MDM, ETL
• Data Stores and Access: Enterprise Data Warehouse; Data Marts (BU 1, BU 2, BU 3); Federated Data Warehouse; SQL Stores; NoSQL Stores (Key Values, Documents, Other NoSQL); Hadoop (Map-Reduce, HDFS, Ecosystem*)
• Data Analysis: Statistics, Data Mining, Operations Research, Neural Nets, Genetic Algorithms, OLAP, Map-Reduce, BI as a Service
• Presentation & Delivery: Alerts, Reports, Dashboards, Spreadsheets, Mobile, Data Visualization
Legend: structured data sources; traditional data integration; traditional data warehousing; big data analytics ramifications.]
The Unified Analytics Platform
[Figure: layered platform diagram. Labels recoverable from the diagram:
• Greenplum Chorus: Analytic Productivity & Tool Integration
• Data Computing Interfaces: SQL, MapReduce, In-Database Analytics, Parallel Data Loading (batch or real-time)
• Greenplum Database: SQL DB Engine, Compute & Storage
• Greenplum Hadoop: MapReduce Engine, Compute, Storage
• Parallel data exchange between the two engines over the network
• All Data Types]
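MapReduce is listed as one of the data computing interfaces. For readers unfamiliar with the model, here is a minimal single-process sketch of its map, shuffle, and reduce phases. The log records and function names are hypothetical, and this is neither the Greenplum nor the Hadoop API; in a real cluster each phase would run in parallel across many nodes.

```python
from collections import defaultdict

# Hypothetical activity-log records, invented for illustration.
records = ["alice login", "bob login", "alice purchase", "alice logout"]


def map_phase(record):
    """Emit (key, value) pairs: one count for the user named in a record."""
    user, _event = record.split(" ", 1)
    yield user, 1


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


pairs = (pair for record in records for pair in map_phase(record))
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

Because each record is mapped independently and each key is reduced independently, the work can be split across servers, which is what makes the model a fit for large data volumes.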
Platform Independence
Delivers Choice and Flexibility
Software-Only
• On your x86 hardware
• Flexibility for any workload
Virtualized Infrastructure
• Pool resources
• Elastic scalability
Data Computing Appliance
• Optimized Price/Performance
• Minimum time-to-value
• Ideal for Production Environments
Some people just don’t get it!
"It's challenging finding customers out there doing big data analytics because building projects that can handle big data requires huge amounts of cash," SAP
"At the moment the hype is ahead of business drivers," Teradata
* http://www.v3.co.uk/v3-uk/news/2123281/analytics-challenges-mainstream-adoption
Top 5 Things You Should Know About Big Data
5. Big Data does not eliminate leadership errors.
GFC
4. Big Data means you can and should leverage social data.
3. Big Data requires new tools and technology.
2. Big Data requires new skills in your workforce.
1. Big Data is bigger than “Cloud”.
Top 3 Steps You Should Take On Your Journey To Big Data Analytics
1. Put all your data to work.
2. Have a data strategy. Model less, iterate more.
3. Invest in people, technology, and your own commitment.
“Luck is what happens when preparation meets opportunity.”
Seneca (Roman philosopher, mid-1st century AD)