
Page 1:

GPU DATA WAREHOUSE FOR MASSIVE DATA ANALYTICS

Page 2:

OBJECTIVES

• Introduction to SQream
• The Data Analytics Scalability Problem
• What is SQream DB
• How SQream works with IBM
• SQream Case Studies
• Questions and Next Steps

Page 3:

CORPORATE PROFILE

• Founded in 2010
• HQ at 7 WTC, New York | R&D in Tel Aviv
• Strategic partnership with Alibaba Cloud
• Patents: 10
• Employees: 70+

Page 4:

YOUR DATA STORES ARE GROWING EXPONENTIALLY

• 2008: <1-4 TB
• 2010: <10 TB
• 2016: TB-PB

Technology: CPU

Technology: GPU

Page 5:

BUT YOUR DATABASE WASN’T BUILT TO HANDLE THIS LEVEL OF DATA

• 1970s-1990s: Classic Relational zone
• 1990-2010: MPP zone
• 2005-2010: In-Memory zone
• 2010…: Massive Data zone

Page 6:

SQL QUERIES AND BI ANALYTICS ARE TAKING WAY TOO LONG

Page 7:

VALUABLE INSIGHTS GO UNDISCOVERED

• BI Lost: 90%
• Data Analyzed: <10%

Page 8:

NEW PARADIGM REQUIRED TO HANDLE MASSIVE DATA

Page 9:

SQREAM DB
GPU-ACCELERATED DATA WAREHOUSE

• Queries: 100x faster
• Cost: 10% of resources
• Analyze: 20x more data

Page 10:

SQREAM DB
COMPLEMENTS YOUR EXISTING DATA STORES

POWERED BY GPUs
• Massively parallel engine
• Faster and smaller than CPUs

MASSIVELY SCALABLE
• Terabytes to petabytes
• Not limited by RAM

LIGHTNING FAST
• Ingests 3 TB/hr/GPU
• Powerful columnar storage
• Always-on compression

SQL DATABASE
• Familiar ANSI SQL
• Standard connectors

MINIMAL FOOTPRINT
• 100 TB in a 2U server
• Highly cost-efficient

EXTENSIBLE FOR ML/AI
• Python, AI, Jupyter, etc.
• Built for data science

Page 11:

HOW IT WORKS

[Diagram stages]
• Raw data
• Columnar process + metadata tagging
• Chunking
• Automatic adaptive compression
• Data skipping
• Parallel chunk processing (GPU)
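The chunking, metadata tagging, and data skipping stages named on this slide can be illustrated with a small Python sketch. It is a toy under stated assumptions, not SQream DB's implementation: the `Chunk` class, the fixed chunk size, and the choice of per-chunk min/max values as the metadata are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Chunk:
    """A slice of one column plus metadata recorded at load time (hypothetical layout)."""
    values: List[int]
    min_value: int
    max_value: int


def build_chunks(column: List[int], chunk_size: int) -> List[Chunk]:
    """Split a column into fixed-size chunks and tag each with its min and max."""
    chunks = []
    for start in range(0, len(column), chunk_size):
        part = column[start:start + chunk_size]
        chunks.append(Chunk(part, min(part), max(part)))
    return chunks


def scan_greater_than(chunks: List[Chunk], threshold: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of chunks skipped via metadata."""
    matches: List[int] = []
    skipped = 0
    for chunk in chunks:
        if chunk.max_value <= threshold:
            skipped += 1  # no value in this chunk can match, so it is never read
            continue
        matches.extend(v for v in chunk.values if v > threshold)
    return matches, skipped


column = list(range(1, 13))  # values 1..12, sorted like a timestamp column
chunks = build_chunks(column, chunk_size=4)
matches, skipped = scan_greater_than(chunks, threshold=8)
print(matches, skipped)
```

In this sketch a filter only touches chunks whose metadata says they might contain matches. That is the general idea behind data skipping, and it works best when the filtered column is roughly ordered.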

Page 12:

SIMPLIFY AND ACCELERATE YOUR ANALYTICS

MPP 100 TB | Complex ETL | Aggregations, indexing, cubing | Analytics take hours

MPP 100 TB | Results within minutes

Page 13:

INTEGRATES WITH YOUR BI ECOSYSTEM

Java | Python | SQL | R | C++

- Data Sources -

- ETL and others -

- SQream DB -

Cloud Infrastructure

- BI and Visualization -

Page 14:

SQREAM DB
DECOUPLED COMPUTE AND STORAGE BENEFITS

• Independent scale of workload and data volume
• Hybrid cloud deployment
• Simple and flexible software management

[Diagram: eight compute blocks, each labeled GPU GPU]

Page 15:

SCALE-UP OR OUT

• Scale up by expanding attached storage
• Scale out by adding compute nodes

Page 16:

FAST AND SIMPLE BIG DATA EXPLORATION

• Query raw data directly
• Immediate ad-hoc querying
• More dimensions, more joins
• More insights, better accuracy
• Enhanced business intelligence

• Multiple JOINs on any field
• Time Series
• Regular Expressions
• ANSI-92 Compatible
• Window Analysis
• Connectivity: ODBC, JDBC, Python
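To make the "JOINs" and "Window Analysis" items concrete, here is a minimal sketch of such a query issued from Python. It runs against an in-memory SQLite database as a stand-in, because the deck does not show SQream DB's connector API. The table names, column names, and data are hypothetical. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; tables and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (customer_id INTEGER, day INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'east'), (2, 'west');
    INSERT INTO orders VALUES (1, 1, 10), (1, 2, 20), (2, 1, 5), (2, 3, 15);
""")

# JOIN plus a window function: running total of order amounts per region over time.
query = """
    SELECT c.region, o.day, o.amount,
           SUM(o.amount) OVER (PARTITION BY c.region ORDER BY o.day) AS running_total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    ORDER BY c.region, o.day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

Against a real deployment, the same SQL text would be sent through whichever ODBC, JDBC, or Python connector is in use.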

Page 17:

HIGH THROUGHPUT CONVERGED

• SQream DB is designed for high throughput
• IBM Power Systems is the only architecture with NVLink CPU-to-GPU connectivity
• IBM AC922, with POWER9 and NVLink, can transfer data at up to 300 GB/s, almost 9.5x faster than the PCIe 3.0 found in x86-based architectures, reducing classic I/O bottlenecks

[Diagram labels: IBM Power 9 (x2), 2x NVIDIA Tesla V100 (x2)]
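The 9.5x figure can be roughly reproduced with a short calculation. The slide does not state the basis of the comparison. The sketch below assumes that 300 GB/s is being compared with the bidirectional bandwidth of a single PCIe 3.0 x16 link, and the x16 lane count is also an assumption.

```python
# PCIe 3.0 signalling: 8 GT/s per lane with 128b/130b encoding.
PCIE3_GT_PER_S = 8.0
ENCODING_EFFICIENCY = 128 / 130
LANES = 16  # assumption: a x16 slot, the usual width for a GPU

# Usable bandwidth in GB/s: transfers per second * efficiency / 8 bits per byte * lanes.
per_direction_gb_s = PCIE3_GT_PER_S * ENCODING_EFFICIENCY / 8 * LANES
bidirectional_gb_s = 2 * per_direction_gb_s

NVLINK_GB_S = 300.0  # figure quoted on the slide

ratio = NVLINK_GB_S / bidirectional_gb_s
print(f"PCIe 3.0 x16: {per_direction_gb_s:.2f} GB/s per direction, "
      f"{bidirectional_gb_s:.2f} GB/s bidirectional")
print(f"NVLink / PCIe ratio: {ratio:.2f}x")
```

Under these assumptions the ratio comes out at about 9.5, in line with the slide.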

Page 18:

UP TO 3.7x FASTER QUERIES
SQREAM DB ON POWER9

• SQream DB on Power9 is between 150% and 370% faster than comparable x86 architectures
• The CPU-GPU NVLink bandwidth is key to performance in complex queries

IBM Power9 AC922: 2x POWER9 16C @ 3.8GHz | 256 GB DDR4 2666 MHz | SSD storage | 4x NVIDIA Tesla V100 (SXM2 NVLINK - 16GB)
Dell PowerEdge R740: 2x Intel Xeon Silver 4112 CPU @ 2.60GHz | 256 GB DDR4 2666 MHz | SSD storage | 4x NVIDIA Tesla V100 (PCIe - 16GB)

[Chart: SQream DB performance, IBM Power9 vs Intel Xeon (Skylake). X axis: Query (TPC-H Query 8, TPC-H Query 6, TPC-H Query 19, TPC-H Query 17). Y axis: Query time (seconds), lower is better. Series: Dell PowerEdge R740, IBM Power9 AC922. Data labels as extracted: 52.83, 10.35, 84.578.57, 14.06, 2.8, 30.29, 29.01]

Page 19:

UP TO 2x FASTER LOADING
SQREAM DB ON POWER9

• SQream DB relies on CPU as well as GPUs for loading
• IBM’s Power9 multi-core architecture makes loading much faster than comparable x86-based systems
• The IBM Power9 system loaded data nearly twice as fast as the x86-based machine

IBM Power9 AC922: 2x POWER9 16C @ 3.8GHz | 256 GB DDR4 2666 MHz | SSD storage | 4x NVIDIA Tesla V100 (SXM2 NVLINK - 16GB)
Dell PowerEdge R740: 2x Intel Xeon Silver 4112 CPU @ 2.60GHz | 256 GB DDR4 2666 MHz | SSD storage | 4x NVIDIA Tesla V100 (PCIe - 16GB)

[Chart: Load time for 6 billion TPC-H records. Y axis: Load time (seconds), lower is better. Series: Dell PowerEdge R740, IBM Power9 AC922. Data labels: 1,929, 1,094]
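The load-time chart can be turned into throughput and speedup figures with a few lines of Python. The pairing of values to systems is an assumption. It follows the legend order and the slide's claim that the Power9 system loaded faster, so 1,929 seconds is taken as the Dell result and 1,094 seconds as the Power9 result.

```python
RECORDS = 6_000_000_000  # "6 billion TPC-H records" from the chart title

# Assumed pairing of chart values to systems (legend order plus the slide's claim).
load_seconds = {
    "Dell PowerEdge R740": 1929,
    "IBM Power9 AC922": 1094,
}

for system, seconds in load_seconds.items():
    rate_millions = RECORDS / seconds / 1e6
    print(f"{system}: {rate_millions:.2f} million records/s")

speedup = load_seconds["Dell PowerEdge R740"] / load_seconds["IBM Power9 AC922"]
print(f"Speedup: {speedup:.2f}x")
```

On these numbers the speedup is about 1.76x, which the slide rounds to "nearly twice as fast" and the title to "up to 2x".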

Page 20:

FINANCE
• Fraud analysis
• Risk consolidation
• Customized services

RETAIL
• Monitor Competitors
• Customer Experience
• Operational Decisions

TELECOM
• Customer 360
• Competitive Analysis
• Network Optimization

HEALTHCARE
• Care Management
• IoT Devices
• Genomic Research

Page 21:

UNDERSTAND 40 MILLION CUSTOMERS
TELECOM

• HP DL380g9 with NVIDIA Tesla GPUs, 96 GB RAM + 6 TB storage | $200K
• 40 nodes, 5 full racks, 7600 CPU cores | $10,000,000

[Chart labels: Ingest time, Reporting time, Ownership Cost. Values as extracted: 18M, 10M, 360M, 120M]

Page 22:

INCREASE REVENUES
AD-TECH

85 TB/day in ad impressions for constructing bidding histograms

[Diagram labels: Acquisition Sources, Data, Extract, Data Ingest, 2x NVIDIA Tesla GPUs, Queries take 5 hours, Queries take 5 minutes]

Page 23:

INCREASE REVENUES
AD-TECH

360 TB/day ingested to enhance bid histogram accuracy

[Diagram labels: Acquisition Sources, Data, Extract (not feasible), Data Ingest, 8x NVIDIA Tesla GPUs, Queries take 5 minutes]
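As a rough sanity check, the daily volumes in the two ad-tech slides can be compared with the "Ingests 3 TB/hr/GPU" figure from the feature slide. The sketch assumes that ingest scales linearly with GPU count and that loading can run 24 hours a day. The deck does not state either of these, and the function name is hypothetical.

```python
INGEST_TB_PER_HOUR_PER_GPU = 3  # "Ingests 3 TB/hr/GPU" from the feature slide


def peak_daily_ingest_tb(gpus: int) -> int:
    """Upper bound on daily ingest, assuming linear scaling and 24 h of loading."""
    return gpus * INGEST_TB_PER_HOUR_PER_GPU * 24


# (GPU count, required TB/day) pairs as read from the two ad-tech slides.
for gpus, required_tb in [(2, 85), (8, 360)]:
    peak = peak_daily_ingest_tb(gpus)
    print(f"{gpus} GPUs: peak {peak} TB/day, required {required_tb} TB/day, "
          f"fits: {required_tb <= peak}")
```

Under these assumptions both workloads fit within the stated per-GPU ingest rate with some headroom.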

Page 24:

DISCOVER NEW BUSINESS INSIGHTS
WHOLESALE

$30 Billion Company - Supply Chain Use Case

Query Time Reduced from 30 Minutes on Exadata to 30 Seconds on SQream

Vast insights uncovered from untapped data

Page 25:

WHAT OUR CUSTOMERS SAY

“SQream and Orange demonstrate 100x cost performance, removing limits of databases.”
Pascal Déchamboux | Director of Software

“SQream helps us keep pace with rapidly increasing data for real customer benefits.”
Suppachai Panichayunon, Head Solution Architect

“SQream is helping us to cut years of cancer research on large genomic datasets.”
Prof. Gideon Rechavi, Head of Cancer Research

“We saw a cost effective opportunity to obtain analytic capabilities we couldn’t have before.”
RF Group Leader

Page 26:

PROOF OF VALUE

• NDA
• SQream Introduction
• PoC Plan and Scope: Use case in detail | KPIs | Schema and queries | Data samples
• Kickoff: HW and SW recommendations | ETL process design
• Internal Evaluation: Use case validation | Load and test | Query optimization

Page 27:

Business Value Identification

• What are your SQL analytics use cases?
• What do you use now for database and visualization?
• How much data do you perform analytics on now, in terabytes? And how much would you like to analyze now and in 1 to 3 years, in terabytes?
• How much data is in your data lake and what is the rate of growth?
• What kind of query response times do you get now and what are you looking for?
• What are the data sources and do you feel you analyze enough dimensions?
• How many users? Total and concurrent?

Page 28:

FEEL FREE TO CONTACT

ADDRESS
Headquarters, 7 WTC, 250 Greenwich Street, New York, New York

Joel Sehr, VP Sales, [email protected] | www.sqream.com

WE ARE SOCIAL