1
The Case for a Versatile Storage System
NetSysLab, The University of British Columbia
Samer Al-Kiswany, Abdullah Gharaibeh, Matei Ripeanu
2
Introduction
HotStorage ‘09
Versatile Storage System for large-scale platforms:
• Underutilized resources
• Application specialization
The Deployment Approach:
• Configured at deployment time
• Coupled with the target application
Potential: Higher performance and scalability
3
Platform Example – Argonne Blue Gene/P
[Figure: Argonne Blue Gene/P IO architecture]
• 160K cores
• 3D torus network: 2.5 GBps per node
• Tree network: 850 MBps per 64 nodes
• 2.5K IO nodes
• 10 Gb/s switch complex
• GPFS: 24 servers
IO rate: 8 GBps = 51 KBps / core !!
Underutilized resources.
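The per-core figure on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes binary prefixes (K = 1024, G = 1024^3), which is an assumption and not stated on the slide; the function name is illustrative only.

```python
# Figures taken from the slide. Binary prefixes are an ASSUMPTION made here
# so that the result matches the slide's rounded value.
CORES = 160 * 1024                # "160K cores"
AGGREGATE_IO_BPS = 8 * 1024**3    # "8 GBps" aggregate IO rate


def per_core_io_kbps(aggregate_bps: int, cores: int) -> float:
    """Return each core's fair share of the aggregate IO rate, in KBps (1 KB = 1024 B)."""
    return aggregate_bps / cores / 1024


if __name__ == "__main__":
    print(f"{per_core_io_kbps(AGGREGATE_IO_BPS, CORES):.1f} KBps per core")
```

Under these assumptions the share comes to roughly 51 KBps per core, which is the number quoted on the slide.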
4
Workload Characteristics
Workflows – Execution stages communicating through intermediate temporary files
[Figure: example workflow graph; legend: input file, output file, compute. Source: Zhao et al., SIGMOD Record ‘05]
5
Workload Characteristics
Workflows – Execution stages communicating through intermediate temporary files
[Figure: example workflow. Source: Tibi Stef-Praun et al., e-Social Science ‘07]
6
Workload Characteristics
Workflows – Execution stages communicating through intermediate temporary files
Workloads: Workflows
Axes                         Optimizations
Data lifetime (temporary)    Application-informed caching
Read (sequential)            Read-ahead
Write (sequential)           Asynchronous write
Consistency (not required)   Relaxed consistency
7
Workload Characteristics
Data Analysis – Analyze/search large data sets (e.g. BLAST)
BLAST: match new sequences against a data set of known sequences (linear search)
Workloads: Workflows, Data Analysis
Axes                         Optimizations
Data lifetime (temporary)    Application-informed caching
Read (sequential)            Read-ahead
Write (sequential)           Asynchronous write
Consistency (not required)   Relaxed consistency
Locality                     Caching
8
Workload Characteristics
Checkpointing
Workloads: Workflows, Data Analysis, Checkpointing
Axes                         Optimizations
Data lifetime (temporary)    Application-informed caching
Read (sequential)            Read-ahead
Write (sequential)           Asynchronous write
Consistency (not required)   Relaxed consistency
Locality                     Caching
Compressibility              Similarity detection
9
Workload Characteristics
Workloads: Workflows, Data Analysis, Checkpointing
Axes                         Optimizations
Data lifetime (temporary)    Application-informed caching
Read (sequential)            Read-ahead
Write (sequential)           Asynchronous write
Consistency (not required)   Relaxed consistency
Locality                     Caching
Compressibility              Similarity detection
Security                     Tunable security levels
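The axes-to-optimizations table can be read as a simple rule set: each workload characteristic, when present, enables one storage optimization. The sketch below encodes that reading in Python. `WorkloadProfile` and `select_optimizations` are hypothetical names introduced for illustration, not part of the presented system.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical description of a workload along the axes in the table."""
    temporary_data: bool = False
    sequential_reads: bool = False
    sequential_writes: bool = False
    needs_strict_consistency: bool = True
    has_locality: bool = False
    compressible: bool = False
    needs_security: bool = False


def select_optimizations(p: WorkloadProfile) -> List[str]:
    """Return the optimizations the table associates with the profile's characteristics."""
    rules = [
        (p.temporary_data, "application-informed caching"),
        (p.sequential_reads, "read-ahead"),
        (p.sequential_writes, "asynchronous write"),
        (not p.needs_strict_consistency, "relaxed consistency"),
        (p.has_locality, "caching"),
        (p.compressible, "similarity detection"),
        (p.needs_security, "tunable security levels"),
    ]
    return [name for enabled, name in rules if enabled]


if __name__ == "__main__":
    # Workflow-style profile: the first four rows of the table.
    workflow = WorkloadProfile(
        temporary_data=True,
        sequential_reads=True,
        sequential_writes=True,
        needs_strict_consistency=False,
    )
    print(select_optimizations(workflow))
```

A profile like this could plausibly come from application profiling, which the deck lists as future work; here it is filled in by hand.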
10
Opportunities
Specialization: application-specialized storage
Underutilized resources:
• Compute node storage space
• Interconnect bandwidth
11
Our Solution
Versatile Storage System: Application specialized
The Deployment Approach:
• Configured at deployment time
• Lifetime coupled with the target application
Potential: Higher performance and scalability
12
Versatile Storage System Architecture
[Figure: system architecture. Components shown: Manager (metadata management), Access Module, Storage Node, Compute Node]
13
Configurable / Extensible IO Pipeline
[Figure: IO pipelines in the Access Module and the Storage Node]
First pipeline: Application IO, Queue, Dispatcher, Buffer Management, …, Consistency, Metadata Operations, Content Addressability, Data Security, Communication Agent
Second pipeline: Application IO, Queue, Dispatcher, Buffer Management, Metadata Operations
14
Configurable / Extensible IO Pipeline
[Figure: IO pipelines in the Access Module and the Storage Node]
First pipeline: Application IO, Queue, Dispatcher, Buffer Management, …, Consistency, Metadata Operations, Content Addressability, Data Security, Communication Agent
Second pipeline: Dispatcher, …, Consistency, Content Addressability, Data Security, Communication Agent
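One way to picture a configurable IO pipeline is as an ordered list of named modules selected at deployment time. The following is a minimal sketch of that idea, not the presented system's design. The registry, the request dictionary layout, and the choice of SHA-1 for content addressing are all assumptions made for illustration.

```python
import hashlib
from typing import Any, Callable, Dict, List

Request = Dict[str, Any]
Module = Callable[[Request], Request]

# Hypothetical registry of pipeline modules, keyed by name.
REGISTRY: Dict[str, Module] = {}


def register(name: str) -> Callable[[Module], Module]:
    """Decorator that adds a module to the registry under the given name."""
    def wrap(fn: Module) -> Module:
        REGISTRY[name] = fn
        return fn
    return wrap


@register("buffer_management")
def buffer_management(req: Request) -> Request:
    # Placeholder stage: only records that the request passed through.
    req.setdefault("trace", []).append("buffer_management")
    return req


@register("content_addressability")
def content_addressability(req: Request) -> Request:
    # Names the block by a hash of its content (SHA-1 is an assumption).
    req["block_id"] = hashlib.sha1(req["data"]).hexdigest()
    req.setdefault("trace", []).append("content_addressability")
    return req


def build_pipeline(module_names: List[str]) -> Module:
    """Compose the named modules, in order, into one callable pipeline."""
    stages = [REGISTRY[name] for name in module_names]  # KeyError if unknown

    def run(req: Request) -> Request:
        for stage in stages:
            req = stage(req)
        return req

    return run


if __name__ == "__main__":
    # Deployment-time configuration: pick only the modules the application needs.
    pipeline = build_pipeline(["buffer_management", "content_addressability"])
    print(pipeline({"op": "write", "data": b"checkpoint bytes"}))
```

Because the pipeline is built from a list of names, leaving a module out of the configuration removes its cost entirely, and a new module only needs to be registered under a new name.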
15
Configurable / Extensible Support
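[Figure: support for adding new modules]
Extension points: Metadata Service API, Dispatcher Request, New Module Support, …
Pipeline shown: Application IO, Queue, Dispatcher, Buffer Management, …, Metadata Operations, NM, Communication Agent
Components shown: Access Module, Storage Node, Manager
Request layout: Header, Request data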
16
Preliminary Evaluation – Real Application
DOCK6 workflow:
Stages                                            Versatile Storage Optimizations   Results (8K processors)
Read input, compute, and write temporary results  Cache the input data              1.06x
Summarize, sort, and select                       Cache temporary files             11.76x
Archive                                           Asynch. flush results to GPFS     1.51x
Overall: 1.52x
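The overall figure is smaller than the best per-stage figure because each stage contributes in proportion to its share of the baseline runtime. The sketch below shows that relationship. The slides do not report per-stage runtime shares, so the fractions in the usage example are hypothetical placeholders and the printed value is not a reproduction of the reported result.

```python
from typing import Sequence


def overall_speedup(fractions: Sequence[float], speedups: Sequence[float]) -> float:
    """Combine per-stage speedups, weighting each stage by its baseline runtime share."""
    if len(fractions) != len(speedups):
        raise ValueError("need one speedup per stage")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("stage fractions must sum to 1")
    # New runtime relative to the baseline: each stage shrinks by its own speedup.
    new_time = sum(f / s for f, s in zip(fractions, speedups))
    return 1.0 / new_time


if __name__ == "__main__":
    stage_speedups = [1.06, 11.76, 1.51]   # per-stage results from the slide
    baseline_fractions = [0.6, 0.3, 0.1]   # HYPOTHETICAL shares, not from the slides
    print(f"{overall_speedup(baseline_fractions, stage_speedups):.2f}x")
```

A stage with a large speedup moves the overall result only as far as its share of the original runtime allows.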
17
Summary
Versatile Storage System:
• Underutilized resources
• Application specialization
The Deployment Approach:
• Configured at deployment time
• Coupled with the target application
Potential: Higher performance and scalability
18
Not addressed – Future work
Configurability / extensibility evaluation:
• Complete prototype
• Evaluation with a diverse set of applications
Configuration:
• Application profiling
• File system automated configuration