Improve Hadoop Economics, Performance,
and Security with Compression and
Encryption
Ravi Lambi
Director of Software Engineering
Data Compression and Security Business Unit
Exar Corporation
Santa Clara, CA USA
November 2014
The Storage IO Bottleneck: Performance Gap
[Chart: Processor vs. Traditional Disk performance, logarithmic axis from 1 to 100,000,000]
The Storage IO Bottleneck: Current Server Solution
More disks and more rack space
This increases management cost and requires a more expensive storage controller. There is also a limit to how far this can scale: each server has a hard physical limit.
The Storage IO Bottleneck: Summarized Challenge
It is difficult to balance the performance, capacity scaling, and cost associated with storage IO.
[Diagram: Performance, Cost, Capacity]
[Diagram: Hadoop job stages: Ingest, Map, Compress/Distribute, Decompress/Compute, Reduce, Output. Other components shown: Compression Codec, Disk, Network]
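The Compress/Distribute and Decompress/Compute stages in the diagram can be illustrated with a small round trip. This is only a sketch: it uses Python's standard `zlib` module as a stand-in for a Hadoop compression codec, and the function names `compress_for_distribution` and `decompress_for_compute`, as well as the sample records, are hypothetical and not taken from the slides.

```python
import zlib


def compress_for_distribution(records):
    """Serialize intermediate records and compress them before they
    are written to disk or sent over the network."""
    payload = "\n".join(records).encode("utf-8")
    return zlib.compress(payload)


def decompress_for_compute(blob):
    """Reverse the step above on the receiving side, before compute."""
    return zlib.decompress(blob).decode("utf-8").split("\n")


# Hypothetical, highly repetitive intermediate output.
records = ["key%04d\tvalue" % (i % 10) for i in range(1000)]

raw_size = len("\n".join(records).encode("utf-8"))
blob = compress_for_distribution(records)
restored = decompress_for_compute(blob)

print("raw bytes:", raw_size, "compressed bytes:", len(blob))
```

The fewer bytes that cross the disk and network, the less the job waits on storage IO, at the cost of CPU time spent in the codec.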
Compression Technology: Where in Hadoop to Apply Compression?
Exar’s Hadoop Acceleration - AltraHD
[Diagram: Hadoop job stages: Ingest, Map, Compress/Distribute, Decompress/Compute, Reduce, Output. Other components shown: Compression Codec, File System Filter Driver, Driver, Disk, Network, and a 5 GB/sec HW Compression & Encryption Accelerator]
Offload Compression and Accelerate
AltraHD Overview
[Diagram components: Storage Volume, Native File System, File System Filter Driver, Driver, Applications, Native Linux Kernel]
• File System Filter Driver
– Kernel plug-in at the file system layer
– Compresses/decompresses ALL files independent of application
• Transparent to the Application
– No modification to Applications or Workflow
• Seamlessly Layers over File System
– Supports EXT3, EXT4, or XFS
• Fast, Easy Deployment
– No APIs – Software installs in minutes
• Hardware Acceleration Offloads Host CPU
[Diagram label: Exar Compression & Encryption Acceleration Card, Core Technology]
Value Proposition – Performance: MapReduce 1 Terasort Benchmark
[Chart: run time in seconds (axis 0 to 1400), Native vs. AltraHD, with % improvement per configuration]
Configuration    % Improvement
EXT3 8 Disk      93%
XFS 8 Disk       35%
EXT4 8 Disk      27%
EXT3 12 Disk     51%
XFS 12 Disk      21%
EXT4 12 Disk     22%
Value Proposition – Performance: MapReduce 2 Job Execution Time
[Chart: job execution time in seconds (axis 0 to 2200), Native vs. AltraHD, with % improvement per configuration]
Configuration    % Improvement
EXT3 8 Disk      27%
XFS 8 Disk       34%
EXT4 8 Disk      31%
EXT3 12 Disk     18%
XFS 12 Disk      18%
EXT4 12 Disk     24%
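The benchmark charts report a "% Improvement" between Native and AltraHD run times, but the slides do not say how that percentage is computed. The sketch below shows two common conventions side by side. The function names and the timings (1000 s and 800 s) are hypothetical, chosen only to make the arithmetic easy to follow; they are not measurements from the deck.

```python
def time_reduction_pct(native_s, accelerated_s):
    """Percentage of the native run time that was saved."""
    return 100.0 * (native_s - accelerated_s) / native_s


def speedup_pct(native_s, accelerated_s):
    """Percentage by which throughput rose, relative to the accelerated time."""
    return 100.0 * (native_s - accelerated_s) / accelerated_s


# Hypothetical timings in seconds, not taken from the charts.
native_s = 1000.0
accelerated_s = 800.0

reduction = time_reduction_pct(native_s, accelerated_s)
speedup = speedup_pct(native_s, accelerated_s)

print("time reduction: %.1f%%" % reduction)
print("speedup: %.1f%%" % speedup)
```

The two conventions diverge as the gap grows, so a reader comparing the chart values against other benchmarks would need to know which one was used.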
Value Proposition – Storage
Increased Storage Capacity
[Charts: effective storage capacity in terabytes, Native vs. AltraHD]
Workload    Native (TB)    AltraHD (TB)
MR1         192            1344
MR2         192            672
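The capacity charts can be reduced to a simple ratio of effective to native capacity. The sketch below assumes that 192 TB is the native figure in both charts and that the larger values (1344 TB for MR1, 672 TB for MR2) are the AltraHD effective capacities; the slides list the numbers without labelling each bar, so that pairing is an inference. The name `capacity_gain` is hypothetical.

```python
def capacity_gain(native_tb, effective_tb):
    """Return how many times larger the effective capacity is."""
    return effective_tb / native_tb


# Values read from the charts; the native/effective pairing is assumed.
NATIVE_TB = 192
EFFECTIVE_TB = {"MR1": 1344, "MR2": 672}

gains = {
    workload: capacity_gain(NATIVE_TB, effective)
    for workload, effective in EFFECTIVE_TB.items()
}

for workload, gain in gains.items():
    print("%s: %.1fx effective capacity" % (workload, gain))
```

Under that assumption the same raw disks hold 7 times as much data in the MR1 case and 3.5 times as much in the MR2 case.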
Value Proposition – Storage
Increased Storage Capacity
[Figure: Native storage vs. AltraHD effective storage]
Value Proposition – Security
Exar’s Compression Acceleration Card supports compression, encryption, and hashing in a single pass.
Aligned with the Hadoop security roadmap
[Diagram: Compression, Encryption, Hashing]
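The idea of doing several transformations in a single pass can be shown in software: each chunk of data is read once and fed to every stage, instead of re-reading the data once per stage. The sketch below is a partial analogue only. It covers compression and hashing with Python's standard `zlib` and `hashlib` modules and leaves out encryption, because the standard library has no cipher. Hashing the uncompressed input is an assumption, as are the function name and sample data; none of this describes how the Exar card works internally.

```python
import hashlib
import zlib


def compress_and_hash(chunks):
    """Compress a stream and hash its plaintext in one pass over the chunks."""
    compressor = zlib.compressobj()
    digest = hashlib.sha256()
    compressed_parts = []
    for chunk in chunks:
        # Each chunk is touched once and handed to both stages.
        digest.update(chunk)
        compressed_parts.append(compressor.compress(chunk))
    compressed_parts.append(compressor.flush())
    return b"".join(compressed_parts), digest.hexdigest()


# Hypothetical input split into fixed-size chunks.
data = b"hadoop block data " * 4096
chunks = [data[i:i + 8192] for i in range(0, len(data), 8192)]

compressed, checksum = compress_and_hash(chunks)
print("compressed to", len(compressed), "bytes, sha256", checksum[:16])
```

A cipher stage would slot into the same loop, typically applied to the compressor's output, since encrypted data does not compress well.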
Value Proposition – Indirect Values: Other Savings
Reduce Indirect Costs:
• Power
• Rack Space
• Cooling
• Disk Life
• etc.