Energy Efficient Prefetching – from Models to Implementation

DESCRIPTION

With the rapid growth in the production and storage of large-scale data sets, it is important to investigate methods that drive down the cost of storage systems. We are in the midst of an information explosion, and large-scale storage centers are increasingly used to store the generated data. Of the several ways to reduce the cost of these centers, we investigate a technique that transitions storage disks into lower power states. This talk introduces a model of disk systems that leverages disk access patterns to prefetch popular sets of data, which creates energy saving opportunities. Using the model, we developed a simulator that lets us quickly change parameters to investigate how file access patterns, disk energy parameters, and simulation parameters affect the overall energy efficiency of disk systems. To improve the validity of our simulation results, we added disk power models to the validated disk simulator DiskSim, which allowed us to test our energy-efficient strategies with a validated storage system simulator. The last part of the talk focuses on implementing a virtual file system for large-scale storage systems. We introduce the Energy Efficient Virtual File System, or EEVFS, to manage data placement and disk states in a cluster storage system. Our modeling and simulation results indicate that large data sizes and knowledge of the disk access pattern are valuable for storage system energy saving techniques. Storage servers that support media streaming applications are one key area that would benefit from our strategies. The final idea introduced in this talk is the concept of parallel striping groups, which attempt to improve the performance of EEVFS while maintaining energy savings.

Energy Efficient Prefetching – from models to Implementation

04/11/23 1

Adam Manzanares and Xiao Qin

Department of Computer Science and Software Engineering

Auburn University

http://www.eng.auburn.edu/~xqin

xqin@auburn.edu

Adam Manzanares

Ph.D. May 2010.

About me

Ph.D.’04, U. of Nebraska-Lincoln

04-07, New Mexico Tech

07-10, Auburn University

About My Research Group

Presentation Outline

• Motivation

• Modeling Work

• DiskSim Modifications

• Energy Efficient Virtual File System (EEVFS)

• Parallel Striping Groups in EEVFS

• Conclusion

Motivation

EPA Report to Congress on Server and Data Center Energy Efficiency, 2007

Motivation

Using 2010 Historical Trends Scenario

◦ Server and Data Centers Consume 110 Billion kWh per year

◦ Assume average commercial end user is charged 9.46 ¢ per kWh

◦ Disk systems can account for 27% of the energy cost of data centers
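
As a rough worked example of what these figures imply, the sketch below multiplies them out. It assumes that the 9.46 figure is a price in US cents per kWh (the unit is not legible on the slide) and that the 27% disk share applies to the whole total. The variable names are illustrative.

```python
# Back-of-the-envelope cost estimate from the figures on this slide.
# Assumption: 9.46 is a price in US cents per kWh.
ANNUAL_ENERGY_KWH = 110e9      # servers and data centers, per year
PRICE_CENTS_PER_KWH = 9.46     # assumed unit: cents per kWh
DISK_SHARE = 0.27              # upper-bound share attributed to disk systems

total_cost_usd = ANNUAL_ENERGY_KWH * PRICE_CENTS_PER_KWH / 100
disk_cost_usd = total_cost_usd * DISK_SHARE

print(f"Total: ${total_cost_usd / 1e9:.2f} billion per year")
print(f"Disks: ${disk_cost_usd / 1e9:.2f} billion per year")
```

Under these assumptions the total is about $10.4 billion per year, of which up to about $2.8 billion can be attributed to disk systems.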

Buffer Disk Architecture

[Diagram: buffer disk architecture. Labeled components: RAM Buffer; Buffer Disk Controller; m buffer disks; n data disks; Disk Requests. Labeled functions: Data Partitioning, Security Model, Load Balancing, Power Management, Prefetching, Energy-Related Reliability Model.]

IBM Ultrastar 36Z15

Transfer Rate: 55 MB/s

Active Power (PA): 13.5 W

Idle Power (PI): 10.2 W

Standby Power (PS): 2.5 W

Spin Down Time (TD): 1.5 s

Spin Up Time (TU): 10.9 s

Spin Down Energy (ED): 13 J

Spin Up Energy (EU): 135 J

Break-Even Time (TBE): 15.2 s
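
The break-even time can be reproduced from the other parameters in the table. The sketch below uses a common definition, which the slides do not state and which is therefore an assumption: TBE is the idle period length at which staying idle costs the same energy as spinning down, waiting in standby, and spinning back up. Standby power is written PS here.

```python
# Disk parameters from the IBM Ultrastar 36Z15 table above.
P_IDLE = 10.2      # W
P_STANDBY = 2.5    # W
T_DOWN = 1.5       # s, spin-down time
T_UP = 10.9        # s, spin-up time
E_DOWN = 13.0      # J, spin-down energy
E_UP = 135.0       # J, spin-up energy

def break_even_time(p_idle, p_standby, t_down, t_up, e_down, e_up):
    """Solve p_idle * T == e_down + e_up + p_standby * (T - t_down - t_up) for T."""
    return (e_down + e_up - p_standby * (t_down + t_up)) / (p_idle - p_standby)

t_be = break_even_time(P_IDLE, P_STANDBY, T_DOWN, T_UP, E_DOWN, E_UP)
print(f"Break-even time: {t_be:.1f} s")
```

With the values above this evaluates to about 15.2 s, which matches the break-even time listed on the slide.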

Prefetching

[Diagram: data disks Disk 1, Disk 2, and Disk 3 alongside a Buffer Disk]
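
A minimal sketch of popularity-based prefetching follows. The slides do not specify the selection policy, so the greedy rule, the function names, the trace, and the file sizes are all hypothetical and only illustrate how a buffer disk hit rate can arise from an access pattern.

```python
from collections import Counter

def choose_prefetch_set(trace, file_sizes_mb, buffer_capacity_mb):
    """Greedily pick the most frequently accessed files that fit on the buffer disk."""
    popularity = Counter(trace)
    chosen, used = set(), 0
    for name, _count in popularity.most_common():
        size = file_sizes_mb[name]
        if used + size <= buffer_capacity_mb:
            chosen.add(name)
            used += size
    return chosen

def hit_rate(trace, prefetched):
    """Fraction of requests that the buffer disk can serve."""
    return sum(1 for name in trace if name in prefetched) / len(trace)

# Hypothetical trace and file sizes, for illustration only.
trace = ["a", "b", "a", "c", "a", "b", "d", "a", "b", "a"]
sizes = {"a": 10, "b": 25, "c": 5, "d": 10}
prefetched = choose_prefetch_set(trace, sizes, buffer_capacity_mb=40)
print(sorted(prefetched), hit_rate(trace, prefetched))
```

Requests that hit the buffer disk never reach the data disks, so a higher hit rate leaves the data disks idle for longer stretches.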

Why Modeling & Simulation

• Allows us to determine the potential of our research ideas

• Can quickly evaluate many simulation parameters

• Allows us to test architectures and hardware without having the physical resources

Modeling & Simulation Work

• Developed Mathematical Model

◦ Disk Energy Consumption

◦ Conditions to prefetch

• Developed Energy Saving Principles

◦ Investigated cases that exploit the energy saving principles

• Implemented model in Java-based simulator

Energy Saving Principles

• Energy Saving Principle One

◦ Increase the length and number of idle periods larger than the disk break-even time TBE

• Energy Saving Principle Two

◦ Reduce the number of power-state transitions
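
The two principles can be illustrated with a small energy calculation. This is a sketch under stated assumptions, not the model from the talk: it uses the IBM Ultrastar 36Z15 figures, assumes the length of each idle period is known in advance, and spins the disk down only when the period exceeds the break-even time. The function name and the workloads are hypothetical.

```python
# Disk parameters taken from the IBM Ultrastar 36Z15 slide.
P_IDLE, P_STANDBY = 10.2, 2.5          # W
T_DOWN, T_UP = 1.5, 10.9               # s
E_DOWN, E_UP = 13.0, 135.0             # J
T_BE = 15.2                            # s, break-even time

def idle_energy(idle_periods, threshold=T_BE):
    """Return (energy in J, power-state transitions) for a list of idle periods.

    Assumes each idle period's length is known in advance, so the disk
    spins down at the start of any period longer than the threshold.
    """
    energy, transitions = 0.0, 0
    for t in idle_periods:
        if t > threshold:
            energy += E_DOWN + E_UP + P_STANDBY * (t - T_DOWN - T_UP)
            transitions += 2          # one spin-down plus one spin-up
        else:
            energy += P_IDLE * t      # stay idle, no transition
    return energy, transitions

# The same 120 s of total idle time, split three ways (hypothetical workloads).
for periods in ([10.0] * 12, [20.0] * 6, [60.0] * 2):
    print(len(periods), "periods:", idle_energy(periods))
```

For the same total idle time, twelve 10 s periods cost about 1224 J because the disk can never sleep, six 20 s periods cost about 1002 J with 12 transitions, and two 60 s periods cost about 534 J with only 4 transitions.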

Parameters Tested

Parameter | Values
Data Size | 1, 5, 10, 25 MB
# of Data Disks | 4, 8, 12
Inter-arrival Delay | 0, 0.1, 0.5, 1 s
Hit Rate | 85, 90, 95, 100%
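
To give a sense of the size of this parameter space, the sketch below enumerates every combination of the listed values. It assumes a full factorial sweep, which the slides do not confirm, and the dictionary keys are illustrative names.

```python
from itertools import product

# Parameter values listed on the slide.
DATA_SIZES_MB = [1, 5, 10, 25]
NUM_DATA_DISKS = [4, 8, 12]
INTER_ARRIVAL_DELAYS_S = [0, 0.1, 0.5, 1]
HIT_RATES_PCT = [85, 90, 95, 100]

# Assumption: a full factorial sweep; the slides do not say whether
# every combination was actually simulated.
configs = [
    {"data_size_mb": d, "data_disks": n, "delay_s": t, "hit_rate_pct": h}
    for d, n, t, h in product(DATA_SIZES_MB, NUM_DATA_DISKS,
                              INTER_ARRIVAL_DELAYS_S, HIT_RATES_PCT)
]
print(len(configs))
```

A full sweep over these values yields 4 × 3 × 4 × 4 = 192 configurations, which is cheap to cover in simulation but costly to reproduce on hardware.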

Energy Savings Hit Rate 85%

State Transitions

Parameter Generalizations

• Larger data sizes produce greater energy savings and fewer state transitions

• Increasing the inter-arrival delay increases energy savings

• More data disks per buffer disk increase energy efficiency

• High hit rates produce the greatest energy efficiency

Modeling & Sim. Summary

Hit Rate, Inter-arrival Delay, & Data Size combine to produce Idle Windows

Transitions important to reduce energy consumption

◦ May increase/decrease to reduce energy consumption

Disk parameters have large impact on energy savings

Model and simulator developed in-house

DiskSim

• Event driven simulator developed at CMU

• Simulates disks at the block level

• The simulator has been validated

• Discrete event based simulator

• Provides a large amount of statistics

• Lacks Disk Power Models

• Ability to simulate large storage systems

File System Simulator

• Large files important to energy savings

• Popularity of data is also useful

• Developed a block to file translator

• Interacts with DiskSim

DiskSim with File System Simulator

Modified DiskSim Results

Modified DiskSim Summary

• Provides us with accurate disk statistics

• Only the changes to DiskSim need to be validated

• Heavily dependent upon disk parameters

• May miss details that can only be found in implementation

Why a Cluster File System

• Block level prefetching difficult

• Natural place to track file accesses

• Control placement of data among storage nodes and data disks

• Tiered approach simplifies management of files and disk states

• Eliminates some shortcomings of modeling and simulation

Energy Efficient Virtual File System

EEVFS Process Flow

EEVFS Testbed

Parameter | Storage Server | Storage Node Type 1 | Storage Node Type 2
CPU | P4 2.0 GHz | P4 3.2 GHz | P4 2.4 GHz
Memory (MB) | 2000 | 1000 | 512
Network Interconnect | 1000 | 1000 | 100
Disk Type | SATA | ATA/133 | ATA/133
Disk Capacity | 120 GB | 80 GB | 80 GB
Disk Bandwidth | 100 MB/s | 58 MB/s | 34 MB/s

Energy Savings

State Transitions

Response Times

Berkeley Web Trace

EEVFS Summary

• Knowledge of requests assumed and may be hard to come by

• Performance tied to one of the buffer disks

Parallel Striping Groups

[Diagram: two parallel striping groups serving File 1, File 2, File 3, and File 4]

• Group 1

◦ Storage Node 1: Disk 1, Disk 2, Buffer Disk

◦ Storage Node 2: Disk 3, Disk 4, Buffer Disk

• Group 2

◦ Storage Node 3: Disk 5, Disk 6, Buffer Disk

◦ Storage Node 4: Disk 7, Disk 8, Buffer Disk

Striping Within a Group

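[Diagram: Group 1, consisting of Storage Node 1 (Disk 1, Disk 2, Buffer Disk) and Storage Node 2 (Disk 3, Disk 4, Buffer Disk). Blocks 1 to 10 of File 1 and File 2 are striped across the group; the odd-numbered and even-numbered blocks appear on separate storage nodes.]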

Striping Within a Group

• Number of disks in a group can be matched to nearest bottleneck

• Striping within the group maintains relatively high performance

• Allows us to use a buffer disk for each storage node, while still maintaining file striping level
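
Striping within a group can be sketched as follows, assuming round-robin assignment of file blocks to the storage nodes of one group. Round-robin is consistent with the odd/even block split suggested by the diagram but is not confirmed by the slides, and `stripe_blocks` is a hypothetical helper, not part of EEVFS.

```python
def stripe_blocks(num_blocks, nodes):
    """Assign blocks 1..num_blocks to the nodes of one group in round-robin order."""
    layout = {node: [] for node in nodes}
    for block in range(1, num_blocks + 1):
        node = nodes[(block - 1) % len(nodes)]
        layout[node].append(block)
    return layout

# Hypothetical group of two storage nodes, as in the Group 1 diagram.
group_1 = ["Storage Node 1", "Storage Node 2"]
layout = stripe_blocks(10, group_1)
for node, blocks in layout.items():
    print(node, blocks)
```

Because each node holds only part of a file, both nodes in the group can serve a request in parallel, while each node still prefetches into its own buffer disk.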

Testbed

Parameter | Storage Server | Storage Node
CPU | Celeron 2.2 GHz | Celeron 2.2 GHz
Memory (MB) | 2000 | 2000
Network Interconnect | 1000 | 1000
Disk Type | SATA | SATA
Disk Capacity | 160 GB | 480 GB
Disk Bandwidth | 126 MB/s | 126 MB/s

Measured Results

Measured Results

Berkeley Web Trace

Response Time Comparison

• Energy efficiency is slightly improved (about 0.6% less energy with striping)

• Response time gain is significant (2.78 s versus 13.87 s, roughly 5x faster)

Parameter | Striping | No Striping
Energy Consumption (J) | 2,088,113 | 2,100,243
Response Time (s) | 2.78 | 13.87
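
The comparison can be quantified directly from the measured values. The short calculation below uses only the numbers in the table; the variable names are illustrative.

```python
# Measured values from the comparison table above.
energy_j = {"striping": 2_088_113, "no_striping": 2_100_243}
response_s = {"striping": 2.78, "no_striping": 13.87}

energy_saving_pct = (
    100 * (energy_j["no_striping"] - energy_j["striping"]) / energy_j["no_striping"]
)
speedup = response_s["no_striping"] / response_s["striping"]

print(f"Energy saved with striping: {energy_saving_pct:.2f}%")
print(f"Response time speedup: {speedup:.1f}x")
```

Striping uses about 0.58% less energy and cuts the response time roughly fivefold, so the main benefit is performance rather than energy.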

Parallel Striping Groups Summary

• Improves the energy efficiency and performance of a storage system

• Designed to scale

– Needs to be tested on large scale storage system

Conclusions

• Modeling and simulation used to test our ideas

– System, Disk, Trace Parameters varied to study their impacts

• DiskSim Modifications

– Added disk power models to DiskSim

– Implemented block to file translator

• Energy Efficient Virtual File System (EEVFS)

– Implemented a prototype

– Added parallel striping groups to improve the energy efficiency

Future Work

• Improve the EEVFS prototype for production use

• Run EEVFS on large scale storage system

– Investigate scaling effects

http://www.auburn.edu/~xzq0001

Download the presentation slides

Questions
