© 2015 IBM Corporation
IBM PureData System for Analytics Overview
Chris Jackson
Technical Sales Specialist
Traditional Data Warehouses are just too complex
Too complex an infrastructure
Too complicated to deploy
Too much tuning required
Too inefficient at analytics
Too many people needed to maintain
Too costly to operate
Too long to get answers
They do NOT meet the demands of advanced analytics on big data.
IBM PureData System for Analytics (Powered by Netezza technology)
The Simple Data Warehouse Appliance for Serious Analytics
What makes it different?
Speed - 10-100x faster than traditional custom systems¹
Simplicity - minimal administration and tuning
Scalability - petabyte+ scale user data capacity
Smart - high performance, advanced analytics
¹ Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
Purpose-built analytics appliance
Integrated database, server and storage
Standard interfaces
Low total cost of ownership
Evolution of Netezza & PureData System for Analytics
2003: NPS® 8000 Series - World’s First Data Warehouse Appliance
2006: NPS® 10000 Series - World’s First 100 TB Data Warehouse Appliance
2009: TwinFin™ - World’s First Petabyte Data Warehouse Appliance
2010: TwinFin™ with i-Class™ Advanced Analytics - World’s First Analytic Data Warehouse Appliance
2012: PureData System for Analytics N200x - World’s Fastest and Greenest Analytical Appliance
2015: PureData System for Analytics N300x - World’s First appliance with no cost encryption
What’s New in PureData System for Analytics N3001
Performance
Faster performance with upgraded CPUs with more cores
New appliance models
New rack mountable, ultra lite appliance for midsize businesses
New 8-rack, Petabyte capacity appliance
Security
Improved security with Self Encrypting Drives
Kerberos support
New Netezza Platform Software (NPS) 7.2
Faster load rates
Performance Portal enhancements
and more
Logical Data Warehouse Capabilities – Ready to go!
PureData appliance with IBM Fluid Query and IBM BigInsights for Hadoop
Exceptional value provided
Included with the PureData System for Analytics N3001:
Data Warehouse Appliance: IBM Fluid Query included with NPS appliance software
Real-time Analytics: InfoSphere Streams Developer Edition, 2 non-production User licenses, including additional select accelerators
Business Intelligence: Cognos Business Intelligence, 5 Analytics User licenses, plus 1 Analytics Administrator license
Hadoop Data Services: BigInsights for Apache Hadoop, 5 virtual servers in aggregate per customer number
Data Integration & Transformation: InfoSphere DataStage, 280 PVUs of DataStage engine, 2 DataStage Designer User licenses, and InfoSphere Data Click
IBM Fluid Query connects to Hadoop
Unifying PureData System for Analytics with Hadoop
Cross platform query & data movement between PureData System for Analytics and Hadoop
Diagram labels: Question, Answer, SQL Query, Data Movement
IBM Fluid Query extends your data warehouse to RDBMS* sources
Unifying PureData System for Analytics with Structured Databases
Cross platform query from PureData System for Analytics to dashDB, DB2, Oracle, and PureData System for Analytics
Diagram labels: Question, Answer, SQL Query
*Relational Database Management System
Introducing PureData System for Analytics N3001-001
Bringing speed and simplicity to midsize organizations for big outcomes
Solution Highlights
• Rack mountable
• Production ready
• Full function appliance
• User data capacity 16 TB*
• High availability - All redundant hardware, 4 disk spares, hot swap power supply
• Self encrypting drives, Kerberos support, LDAP/Active Directory
*Assumes 4x compression
Simple
Same user experience as all PureData System for Analytics appliances
• Full function Netezza Platform Software with IBM Netezza Analytics
• Support tools and Netezza Performance Portal
• ODBC/JDBC/OLE-DB/SQL Driver integration
Load and go with no tuning or administration
Speed
10-100x faster than traditional custom systems¹
Smart
Rich set of in database analytic functions
Protection of all data from unauthorized access
Includes starter kits for Big Data and Business Intelligence
Agile
Easily incorporated into the data center with simplified installation into an existing rack
Affordable
Purchase or lease
¹ Based on IBM customers’ reported results. “Traditional custom systems” refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
Introducing PureData System for Analytics N3001-080
8-rack System
1.5 PB of user data capacity¹
Hosts: 2x x3750M4 and 600 GB Self Encrypting Drives
Blades: 56x HS23 with 20 core Ivy Bridge processors
Storage: 96 EXP2524 disk enclosures with 24x 600 GB Self Encrypting Drives
¹ Assumes 4x compression
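As a quick consistency check on the 1.5 PB figure, the sketch below scales the single-rack N3001-010 numbers quoted elsewhere in this deck (192 TB of user data, 7 S-Blades, 12 disk enclosures) by eight racks. Treating the 8-rack system as eight N3001-010 racks is an assumption, although the blade and enclosure counts on this slide agree with it.

```python
# Single-rack N3001-010 figures as quoted in this deck.
RACK_USER_TB = 192       # user data capacity, assuming 4x compression
RACK_BLADES = 7
RACK_ENCLOSURES = 12

RACKS = 8                # assumption: the 8-rack system is eight such racks

total_tb = RACKS * RACK_USER_TB
total_blades = RACKS * RACK_BLADES
total_enclosures = RACKS * RACK_ENCLOSURES

print(total_tb, "TB of user data")
print(total_blades, "blades,", total_enclosures, "disk enclosures")
```

Under that assumption the total is 1,536 TB, which is 1.5 PB when divided by 1,024 and roughly 1.5 PB in decimal units as well; the 56 blades and 96 enclosures match the slide.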
PureData System for Analytics Family
N2002
• 10-100x faster than custom systems¹
• 3.3x faster I/O scan rate²
• Load and go, no tuning
• Designed to run complex analytics in minutes, not hours
• Rich set of in-database analytics
N3001-xxx
…plus
• Entitled software capability for real-time analytics, Hadoop data services, data movement and business intelligence
• Advanced security
• Partial rack to 8-rack configurations
N3001-001
…plus
• Rack mountable appliance
• Ideal for small and medium business with up to 16 TB of user data
DB2 Analytics Accelerator for z/OS (now with N3001)
• The hybrid computing platform integrating Netezza technology with zEnterprise technology
• Supports transaction processing and analytic workloads concurrently, efficiently & cost effectively
• Accelerates complex queries, up to 2000x faster
• Required security compliance with Data-at-Rest Encryption
¹ Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
² Comparing N1001 scan rate of 145 TB/hour to N2002 scan rate of 478 TB/hour
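The 3.3x scan-rate claim on this slide follows directly from the two scan rates given in its footnote. A minimal check, using only those two numbers (the variable names are illustrative):

```python
# Scan rates from the slide footnote, in TB per hour.
N1001_SCAN_TB_PER_HR = 145
N2002_SCAN_TB_PER_HR = 478

speedup = N2002_SCAN_TB_PER_HR / N1001_SCAN_TB_PER_HR
print(f"{speedup:.1f}x faster I/O scan rate")
```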
Advanced analytics – the traditional way
Diagram labels: Fraud detection; Demand forecasting; Analytical Tools; Analytics grid (C/C++, Java, Python, Fortran, …); Data warehouse; Data; SQL; ETL
IBM PureData System for Analytics Data Warehouse Appliance
IBM PureData System for Analytics – Simplifying serious analytics
Diagram labels: Fraud detection; Demand forecasting; Analytical Tools; Analytics grid (C/C++, Java, Python, Fortran, …); Data warehouse; Data; SQL; ETL
IBM PureData System for Analytics Data Warehouse Appliance
The PureData System for Analytics AMPP Architecture
PureData System for Analytics Appliance
Diagram labels: Advanced Analytics, Loaders, ETL, BI, Applications; “Lite” Host (IBM xSeries, Red Hat Linux); Network Fabric; S-Blades (each with FPGA, Memory, CPU); Disk Enclosures
Field Programmable Gate Array = a blank canvas until it’s configured
S-Blade Data Stream Processing
Select State, Age, Gender, count(*)
From MultiBillionRowCustomerTable
Where BirthDate < '01/01/1960' And State in ('FL', 'GA', 'SC', 'NC')
Group by State, Age, Gender
Order by State, Age, Gender
FPGA Core: Decompress, Project, Restrict, Visibility
CPU Core: SQL & Advanced Analytics
How the query maps onto the stages (from the diagram):
• From MultiBillionRowCustomerTable: stream via Zone Map, then Decompress
• Select State, Age, Gender, count(*): Project
• Where BirthDate < '01/01/1960' And State in ('FL', 'GA', 'SC', 'NC'): Restrict, Visibility
• Group by State, Age, Gender, Order by State, Age, Gender: SQL & Advanced Analytics
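To make the flow on this slide concrete, here is a small software sketch of the same query run as a chain of streaming stages. It is only an illustration of the idea, not the appliance's implementation: the data, the extent layout, and all function names are hypothetical, the zone map is modeled as a per-extent minimum BirthDate, and the Decompress and Visibility stages are left out.

```python
from collections import Counter
from datetime import date

CUTOFF = date(1960, 1, 1)
STATES = {"FL", "GA", "SC", "NC"}

# Hypothetical data: each extent carries a zone-map entry (minimum BirthDate)
# and rows of (state, age, gender, birth_date, unused_column).
EXTENTS = [
    {"min_birth": date(1940, 1, 1), "rows": [
        ("FL", 70, "F", date(1945, 3, 2), "a"),
        ("FL", 70, "F", date(1945, 8, 19), "b"),
        ("GA", 60, "M", date(1955, 7, 9), "c"),
        ("TX", 65, "M", date(1950, 1, 1), "d"),
    ]},
    {"min_birth": date(1980, 1, 1), "rows": [
        ("FL", 30, "F", date(1985, 5, 5), "e"),
    ]},
]

def stream(extents):
    """Zone-map step: skip extents that cannot hold BirthDate < CUTOFF."""
    for extent in extents:
        if extent["min_birth"] >= CUTOFF:
            continue
        yield from extent["rows"]

def project(rows):
    """Keep only the columns the query references."""
    for state, age, gender, birth_date, _unused in rows:
        yield state, age, gender, birth_date

def restrict(rows):
    """Apply the WHERE clause and drop the column no longer needed."""
    for state, age, gender, birth_date in rows:
        if birth_date < CUTOFF and state in STATES:
            yield state, age, gender

def group_and_order(rows):
    """GROUP BY and ORDER BY, the part the slide assigns to the CPU core."""
    return sorted(Counter(rows).items())

result = group_and_order(restrict(project(stream(EXTENTS))))
for (state, age, gender), count in result:
    print(state, age, gender, count)
```

Because each stage is a generator, rows flow through one at a time and most of the data is discarded before the grouping step sees it, which is the point the slide makes about filtering close to the disk.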
Inside the IBM PureData System for Analytics N3001¹
Optimized Hardware + Software
• Hardware accelerated AMPP
• Purpose-built for high performance analytics
• Requires no tuning
SMP Hosts
• SQL Compiler
• Query Plan
• Optimize
• Admin
Snippet Blades™
• Hardware-based query acceleration with FPGAs
• Blistering fast results
• Complex analytics executed as the data streams from disk
Disk Enclosures
• User data, mirror, swap partitions
• High speed data streaming
¹ N3001-001 does not have Hardware Acceleration (FPGA)
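The division of labor described here (hosts compile and coordinate, snippet blades do the work on their own data) can be sketched as a scatter-gather computation. The example below is an illustration under stated assumptions rather than a description of the product: rows are spread across blades by hashing a key, which the slides do not specify, threads stand in for blades, and every name is hypothetical.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

NUM_BLADES = 3  # hypothetical; chosen only for illustration

def distribute(rows, num_blades):
    """Assign each row to a blade by hashing its key (an assumed scheme)."""
    slices = [[] for _ in range(num_blades)]
    for state, amount in rows:
        blade = zlib.crc32(state.encode()) % num_blades
        slices[blade].append((state, amount))
    return slices

def snippet(data_slice):
    """Work done on one blade: a partial row count per state."""
    return Counter(state for state, _amount in data_slice)

def host_query(rows):
    """Host role: fan the snippet out to the blades, then merge the partials."""
    slices = distribute(rows, NUM_BLADES)
    with ThreadPoolExecutor(max_workers=NUM_BLADES) as pool:
        partials = list(pool.map(snippet, slices))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(sorted(total.items()))

ROWS = [("FL", 10), ("GA", 20), ("FL", 30), ("NC", 40), ("SC", 50), ("GA", 60)]
answer = host_query(ROWS)
print(answer)
```

The host only ever handles the small partial results, so its workload stays light no matter how many rows each blade scans.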
Hardware Overview: Model N3001-010
User Data Capacity: 192 TB¹
Data Scan Speed: 478 TB/hr*
Load Speed: 10 TB/hr
Power Requirements: 7.5 kW
Cooling Requirements: 27,000 BTU/hr
Scales up to 8 full racks
Terabyte to Petabyte+ capacity
Up to 10 TB/hr load rate in multi-rack configurations
2 Hosts (Active-Passive)
• 2 Intel 10 Core Ivy Bridge CPUs
• 5 x 600 GB SAS Self Encrypting Drives
• Red Hat Linux 6 64-bit
7 PureData for Analytics S-Blades™
• 2 Intel 10 Core Ivy Bridge CPUs
• 2 8-Engine Xilinx Virtex-6 FPGAs
• 128 GB RAM + 8 GB slice buffer
• Linux 64-bit Kernel
12 Disk Enclosures
• Total 288 600 GB SAS2 Self Encrypting Drives
• 240 for User Data
• 14 for S-Blades
• 34 Spare
• RAID 1 Mirroring
¹ Assuming 4X compression
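The drive counts and the 192 TB figure can be cross-checked with a little arithmetic. The sketch below uses the numbers on this slide plus two assumptions that the deck does not state outright: 24 drives per enclosure (the figure given for the EXP2524 enclosures on the 8-rack slide), and each user-data drive split equally into the three partition types named earlier (user data, mirror, swap).

```python
DRIVE_GB = 600
USER_DATA_DRIVES = 240
S_BLADE_DRIVES = 14
SPARE_DRIVES = 34
ENCLOSURES = 12
DRIVES_PER_ENCLOSURE = 24   # assumption: taken from the 8-rack slide
COMPRESSION = 4             # the slide's stated 4x compression
PARTITIONS_PER_DRIVE = 3    # assumption: equal user data, mirror, swap shares

total_drives = USER_DATA_DRIVES + S_BLADE_DRIVES + SPARE_DRIVES
raw_tb = USER_DATA_DRIVES * DRIVE_GB / 1000
uncompressed_user_tb = raw_tb / PARTITIONS_PER_DRIVE
user_capacity_tb = uncompressed_user_tb * COMPRESSION

print("drives:", total_drives, "of", ENCLOSURES * DRIVES_PER_ENCLOSURE, "slots")
print("raw TB on user-data drives:", raw_tb)
print("user data capacity at 4x compression (TB):", user_capacity_tb)
```

With the equal three-way split, 144 TB of raw disk leaves 48 TB for uncompressed user data, and 4x compression brings that to the 192 TB quoted on the slide. The agreement supports the assumption but does not confirm it.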
The analytic enterprise
BI reporting and ad hoc analysis
• What happened?
• When and where?
• How much?
Predictive analytics
• What will happen?
• What will the impact be?
Optimization
• What is the best choice?
IBM PureData System for Analytics Data Warehouse Appliance
New extensions for ESRI, Spatial and Open Source R
ESRI
“R”
Spatial
IBM Netezza Analytics v3.2
IBM Netezza Analytics
In-database Analytics For Every Role in Your Enterprise
Bring the analytics to the data, not the data to the analytics
Included
• Data Preparation
• Predictive Analytics
• Geospatial Analytics
• Advanced Statistics
Features
• Built-in, in-database analytic functions
- Data mining, prediction, transformations, statistics, geospatial, data preparation
• Full integration with tools for BI & visualization
- IBM Cognos, MicroStrategy, Business Objects, SAS, MS Excel, SSRS, Kognitio, QlikView
• Full integration with tools for model building & scoring
- IBM SPSS, SAS, Open Source R, Fuzzy Logix
• Full integration for custom analytics
- Open Source R, Java, C, C++, Python, Lua
Use cases
• Reduce hospital admissions or personalize disease treatments
• Achieve an order of magnitude improvement in manufacturing quality
• Better understand the risk of catastrophic events
• …and many more