Data Storage Systems: A Survey Abdullah Aldhamin July 29, 2013 CMPT 880: Large-Scale Multimedia Systems and Cloud Computing Course Project


Page 1: Data Storage Systems: A Survey

Data Storage Systems: A Survey

Abdullah Aldhamin
July 29, 2013

CMPT 880: Large-Scale Multimedia Systems and Cloud Computing
Course Project

Page 2: Data Storage Systems: A Survey

Motivation

• Research interest in storage systems, specifically in SSDs

Page 3: Data Storage Systems: A Survey

Outline

• Objective
• Overview
• Solid-State Drives Use Cases

Page 4: Data Storage Systems: A Survey

Objective

• Storage system architectures in enterprise data centers.
• What is cloud storage?
• Integrating flash-based solid-state drives in large-scale storage systems.

Page 5: Data Storage Systems: A Survey

Overview

• Different storage architectures in data centers:
  – Block I/O interface (DAS and SAN)
  – File I/O interface (NAS)
  – Is there a “better” solution?
  – Shortcomings for today’s computing…

Page 6: Data Storage Systems: A Survey

Overview

• Cloud Storage:
  – What is cloud storage?
    • Object-based storage
  – Example: Windows Azure Storage (WAS)
  – Some research problems

Page 7: Data Storage Systems: A Survey

Overview

• Flash-based solid-state drive:
  – What is it?
  – Pros and cons.
  – How can we integrate it in large-scale storage systems?
• Future direction

Page 8: Data Storage Systems: A Survey

Integrating SSDs in Large-Scale Storage Systems

• Considerations and Facts:
  – Non-uniform read access latencies, correlated with workload access pattern
  – Internal drive-specific operations impact the performance
  – Internal fragmentation leads to performance degradation
  – More writes → write amplification → bad wear leveling
  – Performance vs. Lifetime vs. Cost
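
The link between writes, write amplification, and lifetime can be made concrete with a small calculation. The sketch below uses the common definition of write amplification (bytes physically written to flash divided by bytes the host asked to write) and a rough lifetime estimate that assumes ideal wear leveling. The function names and the example figures are illustrative assumptions, not values from the slides.

```python
def write_amplification(host_bytes_written: float, flash_bytes_written: float) -> float:
    """Ratio of bytes physically written to flash to bytes the host requested."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written


def estimated_lifetime_host_bytes(capacity_bytes: float, pe_cycles: int, wa: float) -> float:
    """Host bytes that can be written before the rated program/erase cycles
    run out, assuming every block wears evenly (ideal wear leveling)."""
    if wa <= 0:
        raise ValueError("wa must be positive")
    return capacity_bytes * pe_cycles / wa


# Hypothetical drive: the host wrote 100 GiB, the flash absorbed 250 GiB.
wa = write_amplification(100, 250)

# Hypothetical 256 GiB drive rated for 3000 program/erase cycles.
lifetime_gib = estimated_lifetime_host_bytes(256, 3000, wa)
```

With these made-up numbers the write amplification is 2.5, so the drive can absorb 2.5 times fewer host writes than it could at a write amplification of 1. This is the performance vs. lifetime trade-off in miniature.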

Page 9: Data Storage Systems: A Survey

Integrating SSDs in Large-Scale Storage Systems

• SSD in Storage System
  – SSD-Only System
  – Hybrid System
    • End-Point
    • Accelerator: Write Buffer, Read Cache

Page 10: Data Storage Systems: A Survey

Gordon: SSD-only HPC Cluster

• The first HPC cluster designed with SSD-only storage

• Optimized to utilize the high bandwidth of SSDs for data-intensive applications

Page 11: Data Storage Systems: A Survey

Gordon … (Cont’d)

• Design goals:
  – Reduce performance gap between processor and I/O in large-scale data-intensive computing
  – Improve the system performance
  – Less power

Page 12: Data Storage Systems: A Survey

Gordon… (Cont’d)

• How is the SSD integrated?
  – Replaced conventional hard disks with SSDs
  – Major device-level modification: new flash translation layer

Page 13: Data Storage Systems: A Survey

Gordon… (Cont’d)

• Costly $$$
• Not suitable for widespread adoption
• Requires major device-level modification
  – Optimized for specific workloads

Page 14: Data Storage Systems: A Survey

Hybrid: Griffin

• Griffin hybrid storage system
  – SSD is an end-point store for the data
  – Uses HDDs as write-back buffers
    • Log-structured HDDs to buffer incoming writes
    • Extends SSD lifetime
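
A toy model can show why buffering writes in an HDD log extends SSD lifetime: repeated overwrites of the same block are absorbed by the log, and only the latest version reaches the SSD at migration time. The sketch below is not Griffin's implementation. The class `HddLogWriteBuffer`, its methods, and the migrate-on-demand policy are all assumptions made for illustration.

```python
class HddLogWriteBuffer:
    """Toy model: writes are appended to an HDD log; the SSD is the final store."""

    def __init__(self):
        self.log = []        # append-only log on the HDD: (block, data) pairs
        self.ssd = {}        # final store: block -> data
        self.ssd_writes = 0  # block writes that actually reached the SSD

    def write(self, block, data):
        # Every incoming write is a sequential append to the HDD log.
        self.log.append((block, data))

    def read(self, block):
        # The newest logged version wins; otherwise fall back to the SSD.
        for logged_block, data in reversed(self.log):
            if logged_block == block:
                return data
        return self.ssd.get(block)

    def migrate(self):
        # Keep only the latest version of each block, then write it to the SSD.
        latest = {}
        for block, data in self.log:
            latest[block] = data
        for block, data in latest.items():
            self.ssd[block] = data
            self.ssd_writes += 1
        self.log.clear()


buf = HddLogWriteBuffer()
for version in range(5):
    buf.write(7, f"v{version}")  # block 7 is overwritten five times
buf.write(8, "x")
buf.migrate()
```

In this run the host issues six writes, but only two block writes reach the SSD (one per distinct block), which is the effect the slide summarizes as "extends SSD lifetime".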

Page 15: Data Storage Systems: A Survey

Hybrid: Hystor

• Hybrid storage system
  – SSD is used to improve I/O performance
    • Read cache
    • Write-back buffer
  – Challenge:
    • What data should be cached to benefit most from SSD performance?
    • Minimum system changes.
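
One simple way to approach the "what to cache" question is to admit a block into the SSD cache only after it has been read several times, so that one-off accesses do not consume SSD space or write cycles. The sketch below illustrates that idea with a frequency threshold and LRU eviction. It is not Hystor's actual policy. The class name, the `admit_after` parameter, and the `read_from_hdd` callback are hypothetical.

```python
from collections import Counter, OrderedDict


class FrequencyAdmissionCache:
    """Toy SSD read cache: admit a block after `admit_after` reads, evict LRU."""

    def __init__(self, capacity, admit_after=2):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.admit_after = admit_after
        self.access_counts = Counter()  # reads seen per block
        self.cached = OrderedDict()     # block -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, block, read_from_hdd):
        self.access_counts[block] += 1
        if block in self.cached:
            self.hits += 1
            self.cached.move_to_end(block)  # mark as most recently used
            return self.cached[block]
        self.misses += 1
        data = read_from_hdd(block)
        # Admit only blocks that have proven to be re-read.
        if self.access_counts[block] >= self.admit_after:
            if len(self.cached) >= self.capacity:
                self.cached.popitem(last=False)  # evict least recently used
            self.cached[block] = data
        return data


def fake_hdd_read(block):
    # Stand-in for a slow HDD read.
    return f"data-{block}"


cache = FrequencyAdmissionCache(capacity=2, admit_after=2)
for block in (1, 1, 1, 9):
    cache.read(block, fake_hdd_read)
```

Here block 1 is admitted on its second read and served from the cache on its third, while block 9, read only once, never enters the cache. Because the cache sits in front of an unchanged HDD read path, a policy of this shape also fits the "minimum system changes" constraint.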

Page 16: Data Storage Systems: A Survey

Conclusion

• The choice of I/O interface determines which storage access features are available

• Cloud storage continues to grow to accommodate the ever-increasing volume of collected data

• Solid-state drives have become instrumental in storage systems, but how can we best use them?