
Track 3, Session 4: Implementing a Unified Analytics Platform to Become a Data-Driven Business (Peter Cooper, APJ Technology)


Page 1

© Copyright 2011 EMC Corporation. All rights reserved.

Greenplum: Extracting Value from Your Data

Peter Cooper, APJ Technology Team

Page 2

IN THIS DECADE THE DIGITAL UNIVERSE WILL GROW 44X, FROM 0.9 ZETTABYTES TO 35.2 ZETTABYTES

Source: 2010 IDC Digital Universe Study


Page 4

Source: McKinsey Global Institute 2011

Page 5

Big Data will improve business performance

Through 2015, organisations integrating high-value, diverse, new information types and sources into a coherent information management infrastructure will outperform their industry peers financially by more than 20%.*

* Gartner, Merv Adrian

Page 6

Big Data is more than Size

• Volume: data volumes approaching multiple petabytes
• Velocity: data being generated and ingested for analysis in real-time
• Variety: tabular, documents, e-mail, metering, network, video, image, audio
• Complexity: different standards, domain rules, and storage formats per data type

[Figure: Volume, Velocity, Variety, and Complexity shown across data types: transactional data, documents, smart grid, images, audio, video, text]

New insights on customers, products, and operations

Contextual and location-aware delivery to any device

Source: Gartner, March 2011

Page 7

Increase Revenue With Greenplum

Big Data Analytics Increases Per Customer Profit For Retail Banking Firm

[Figure: customer profit (LOW to HIGH) plotted against data leveraged (TRADITIONAL DATA LEVERAGED to BIG DATA LEVERAGED). Stage labels, in order: Legacy System; Greenplum Database Business Intelligence Reporting; Greenplum In-Database Analytics; Greenplum Big Data Analytics. Capability labels, in order: Agent “Best Guess”; Branch Level Reporting Enabling Profit-based Recommendations; Market Basket Analysis & Buyer Associations Enabling User-based Recommendations; Data Enriched with Unstructured Activity Logs To Identify At Risk Customers]

Page 8

Optimize Marketing Campaigns With Greenplum

Big Data Analytics Improves Customer Interactions For Credit Card Company

[Figure: likelihood of conversion (LOW to HIGH) plotted against data leveraged (TRADITIONAL DATA LEVERAGED to BIG DATA LEVERAGED). Bar labels: Referring URL Only; Mapping Clicks To Users; Twitter Sentiment; User Clustering; User To Funnel Conversion; Off-Website Behaviour; Facebook F-of-F Optimization; Blog and Press Sentiment; YouTube & Podcasts. Stage labels: Legacy System; Greenplum In-Database Analytics; Greenplum Big Data Analytics]

Clicks become users targeted to predicted outcomes

Social Media, Blog and Press, & Competitor Website Behavior, Leveraged to Refine Predictions

Page 9

Big Data Enabled Financial Services

“We are using Greenplum to process data into surveillance patterns, the analytics of issues that are happening of regulatory interest.”
MARTIN COLBURN, CTO

“The ability to process big amounts of information on a near realtime basis has greatly altered the value of data pulled off by NYSE Euronext in the U.S.”
STEVE HIRSCH, CHIEF DATA OFFICER

“Greenplum offers strong scalability advantages due to its highly parallel model that enables us to simply add more servers as data volumes expand.”
ANNA EWING, CIO

Page 10

Greenplum Global Customers

Telecom, Media & Entertainment: Analyze user behavior to eliminate network abuses

Retail: Direct marketing/CRM

Financial Services: Detect and prevent fraud; credit scoring and analysis to reduce credit risk

Pharmaceutical: Analytics for drug discovery and development

Internet: Clickstream analytics for ad targeting and market research

Page 11

Greenplum APJ Customers

Japan: ~1 PB database for CDR analysis and scenario testing

India: Cellular network performance analysis and reporting (~40 TB database)

Thailand: Enterprise data warehouse for customer analytics, replacing Teradata and Oracle systems (~70 TB)

China: Alipay data warehouse (~1 PB)

Australia: Internal audit of user activities

Page 12

No Value Available

Page 13

Today’s Reality for most Companies

[Figure labels: Sources; Departmental warehouses; Shadow systems; ‘Shallow’ Business Intelligence; Static schemas accrete over time; Non-standard, in-memory analytics; Slow-moving models; Slow-moving data]

Page 14

Uncovering the value.

“Over the last 25 years, companies have been focused on leveraging maybe 5% of the information available to them… In order to compete well, companies are looking to dip into the rest of the 95% that can make them better than anyone else.”

Source: Forrester Research Inc.

Today’s Situation | Big Data Analytics Ramifications
Less than 10% of available enterprise data | Vast majority of available data, including external sources
“Rearview mirror” reports, dashboards, and analysis | “Forward looking” predictions with recommendations
Weeks, months, or even quarters old | Real-time or near real-time
Incomplete, “over-processed”, regulatory data | Raw, unstructured, “statistically correct” data
Architectures and methods that take 6 to 18 months to exploit | Vastly accelerated time to market

Page 15

Clicker Question:

Do you think that your current environment is capable of extracting the full value of the data available?

• Yes

• No

Page 16

Databases Need to Adapt to Big Data

The Data Warehousing Institute (TDWI)

• 50% of TDWI survey respondents will replace their DW platform in the next 3 years because:

Poor query response: 45%
Can’t support advanced analytics: 40%
Inadequate data load speed: 39%
Can’t scale up to large data volumes: 37%
Cost of scaling up is too expensive: 33%
Poorly suited to real-time or on-demand workloads: 29%

Callouts: Cannot do advanced analysis; Cannot handle big data volumes

Source: TDWI Next Gen Database Study, 2010

Page 17

Big Data Analytics Reference Architecture

[Figure: a five-stage architecture: Data Input; Integration; Data Stores and Access; Data Analysis; Presentation & Delivery.
Data sources: Multimedia, Web/Social, ERP, CRM, POS, Mobile, Documents, Machine, LOB data.
Integration: Data Quality, MDM, ETL.
Data stores and access: Enterprise Data Warehouse; Data Marts (BU 1, BU 2, BU 3); Federated Data Warehouse; SQL Stores; NoSQL Stores (Key Values, Documents, Other NoSql); Hadoop (HDFS, Map-Reduce, Ecosystem*).
Data analysis: Statistics, Data Mining, Operations Research, Neural Nets, Genetic Algorithms, OLAP, Map-Reduce, BI as a Service.
Presentation and delivery: Alerts, Reports, Dashboards, Spreadsheets, Mobile, Data Visualization.
Legend: Structured data sources; Traditional data integration; Traditional data warehousing; Big data analytics ramifications]

Page 18

The Unified Analytics Platform

Greenplum Chorus: Analytic Productivity & Tool Integration

Data Computing Interfaces: SQL, MapReduce, In-Database Analytics, Parallel Data Loading (batch or real-time)

Greenplum Database: SQL DB Engine

Greenplum Hadoop: MapReduce Engine

[Figure labels: Compute; Storage; Network; parallel data exchange; All Data Types]

Page 19

Platform Independence

Delivers Choice and Flexibility

Software-Only
• On your x86 hardware
• Flexibility for any workload

Virtualized Infrastructure
• Pool resources
• Elastic scalability

Data Computing Appliance
• Optimized Price/Performance
• Minimum time-to-value
• Ideal for Production Environments

Page 20

Some people just don’t get it!

"It's challenging finding customers out there doing big data analytics because building projects that can handle big data requires huge amounts of cash," SAP

"At the moment the hype is ahead of business drivers" Teradata

* http://www.v3.co.uk/v3-uk/news/2123281/analytics-challenges-mainstream-adoption

Page 21

Top 5 Things You Should Know About Big Data

Page 22

5. Big Data does not eliminate leadership errors.

GFC

Page 23

4. Big Data means you can and should leverage social data.

Page 24

3. Big Data requires new tools and technology.

Page 25

2. Big Data requires new skills in your workforce.

Page 26

1. Big Data is bigger than “Cloud”.

Page 27

Top 3 Steps You Should Take On Your Journey To Big Data Analytics

Page 28

1. Put all your data to work.

Page 29

2. Have a data strategy. Model less, iterate more.

Page 30

3. Invest in people, technology, and your own commitment.

Page 31

“Luck is what happens when preparation meets opportunity.”

Seneca (Roman philosopher, mid-1st century AD)
