
Past, Present and Future of Storage Devices



Mircea R. Stan, Sudhanva Gurumurthi

University of Virginia

E-mail: [email protected]


Outline

• Revisiting the Past

• Living in the Present

• Predicting the Future


Caveat: despite the very general title, this is a biased account based on research at the University of Virginia.


Why revisit the Past?


• The reports of HDDs’ demise have been greatly exaggerated!*

* with apologies to Mark Twain


Data center power breakdown: Servers 48%, Storage 37%, Others 15%. 80% of the storage power is consumed by the disk arrays. (Source: The Green Data Center: Energy-Efficient Computing in the 21st Century, Chapter 4, 2007.)

That doesn't mean that there is no trouble!


The microprocessor industry's answer to the power problem?

• Parallelism and multicore

• Can we do something similar for storage?


Inter-Disk vs. Intra-Disk Parallelism

[Figure: inter-disk parallelism uses two drives, so Power = 2×SPM + 2×VCM; intra-disk parallelism uses one drive with two arm assemblies, so Power = 1×SPM + 2×VCM.]

Ghost from the Past: IBM 3340 and Conner Chinook


Intra-Disk Parallelism Taxonomy

• Disk/Spindle [D]

• Arm Assembly [A]

• Surface [S]

• Head [H]

Example: [D=4][A=2][S=2][H=2]

Conventional Disk: [D=1][A=1][S=1][H=1]
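The [D][A][S][H] taxonomy above is easy to capture in code. A minimal sketch (the DiskConfig class, its method names, and the "parallel streams" interpretation are illustrative, not from the paper):

```python
# Sketch of the intra-disk parallelism taxonomy [D][A][S][H].
from dataclasses import dataclass

@dataclass(frozen=True)
class DiskConfig:
    D: int = 1  # disks/spindles
    A: int = 1  # arm assemblies per disk
    S: int = 1  # surfaces accessible per arm assembly
    H: int = 1  # heads active per arm

    def label(self) -> str:
        return f"[D={self.D}][A={self.A}][S={self.S}][H={self.H}]"

    def parallel_streams(self) -> int:
        # Upper bound on concurrent data streams for this configuration.
        return self.D * self.A * self.S * self.H

conventional = DiskConfig()
example = DiskConfig(D=4, A=2, S=2, H=2)
print(conventional.label(), conventional.parallel_streams())  # [D=1][A=1][S=1][H=1] 1
print(example.label(), example.parallel_streams())            # [D=4][A=2][S=2][H=2] 32
```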


Impact of Data Consolidation

• What if we migrate data from multiple disks on to one high-capacity drive?

• Baseline Multiple-Disk Configuration (MD)

• Migrate data from MD to a High-Capacity Single Disk (HC-SD) drive
– Modeled based on the Seagate Barracuda ES (750 GB)

Workload  | Disks | Capacity (GB) | RPM    | Platters
Financial | 24    | 19.07         | 10,000 | 4
Websearch | 6     | 19.07         | 10,000 | 4
TPC-C     | 4     | 37.17         | 10,000 | 4
TPC-H     | 15    | 35.96         | 7,200  | 6


Multi-Actuator Drives

• Single-Arm Movement: HC-SD-SA(n)

• SPTF-based scheduling

– Closest idle arm services the request

• Peak power ~ conventional HDD

• Number of Actuators: n = 1, 2, 3, 4

[Figure: multi-actuator drive geometry with one disk [D=1], two arm assemblies per disk [A=2], one surface per arm [S=1], and one head per arm [H=1].]
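The "closest idle arm services the request" rule can be sketched in a few lines. The arm positions, cylinder numbering, and function name below are hypothetical, for illustration only:

```python
# Sketch of SPTF-style scheduling for a multi-actuator drive:
# the idle arm closest to the target cylinder services the request.

def pick_arm(arm_positions, busy, target_cylinder):
    """Return the index of the idle arm whose head is closest to the target,
    or None if every arm is busy (the request then queues)."""
    idle = [i for i in range(len(arm_positions)) if not busy[i]]
    if not idle:
        return None
    return min(idle, key=lambda i: abs(arm_positions[i] - target_cylinder))

positions = [100, 4000, 9000, 12000]   # current cylinder of each actuator
busy = [False, True, False, False]
print(pick_arm(positions, busy, 8500))  # -> 2 (the arm at cylinder 9000)
```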

HC-SD-SA(n) Performance: Websearch

[Figure: CDF of response time (5 ms to 200+ ms) comparing HC-SD, MD, HC-SD-SA(2), HC-SD-SA(3), and HC-SD-SA(4).]


HC-SD-SA(n) Power Consumption: Websearch

[Figure: average power (Watts, 0-100 scale) for HC-SD, SA(2)/7200, SA(4)/7200, and MD.]

Storage Array Power Comparison: Iso-Performance Configurations

[Figure: average power (Watts, 0-250 scale) for iso-performance storage arrays at 8 ms, 4 ms, and 1 ms request inter-arrival times: 4/8/16 HC-SD disks vs. 2/4/8 SA(2) disks vs. 1/2/4 SA(4) disks.]

Iso-performance datapoints determined via simulation using synthetic workloads. Intra-disk parallel arrays consume 40%-60% less power.


More details: ISCA’08 paper

• http://www.cs.virginia.edu/~gurumurthi/papers/CS-2008-03.pdf

• "Intra-Disk Parallelism: An Idea Whose Time Has Come", S. Sankar, S. Gurumurthi, and M. R. Stan, Proceedings of the 2008 International Symposium on Computer Architecture.

• Intra-disk parallelism can help!

– Can build high-performance, low-power enterprise storage systems


Living in the Present

• “We cannot change the cards we are dealt, just how we play the hand”*

*Randy Pausch, “The Last Lecture”


• So how do we optimize what we have?


Disk Power Management “Knobs”

• Turn off the SPM and spin down the disks
– Not an attractive solution for data center storage systems [Barroso and Hölzle, IEEE Computer 2007]

• Dynamic RPM (DRPM) modulation
– Control SPM voltage
– Available in drives today

• Dynamic speed control of disk arm movement
– Control VCM voltage
– Available in drives today

• How do we optimally control these knobs at runtime?


Sensitivity-Based Optimization of Disk Architecture (SODA)

• Figures of merit: Energy (E), Performance (D)

• Knobs: x, y

[Figure: E-D trade-off curve.]

Optimality requires balancing the ratio of sensitivities with respect to each knob.
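The balance-of-sensitivities condition can be checked numerically. A toy sketch, where E(x, y) and D(x, y) are made-up stand-ins rather than the paper's actual models:

```python
# Numeric sketch of the SODA optimality condition: at an E-D optimum, the
# ratio (dE/dknob) / (dD/dknob) is equal across knobs x and y.

def sensitivity(f, x, y, knob, h=1e-6):
    """Central-difference estimate of df/dknob at (x, y)."""
    if knob == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

E = lambda x, y: x**2 + 2 * y**2   # toy energy model
D = lambda x, y: 1 / x + 1 / y     # toy delay model

x, y = 1.0, 1.0
ratio_x = sensitivity(E, x, y, "x") / sensitivity(D, x, y, "x")
ratio_y = sensitivity(E, x, y, "y") / sensitivity(D, x, y, "y")
# Unequal ratios mean this knob setting is not E-D optimal: trading delay
# via one knob buys more energy savings than via the other.
print(ratio_x, ratio_y)
```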


Using SODA at Design-Time

• Develop analytical models for figures of merit

– Energy and Performance

– Parameterize both device and workload characteristics

• Use workload statistics as input to the model

– E.g., seek times, idle time

• Run the optimization

Spindle Motor Power Model:

P_SPM = n · b_SPM · ω_SPM^2.8

b_SPM = (C_d · ρ · π · r^4.6) / 2

where n is the number of platters, ω_SPM the spindle angular velocity, C_d the drag coefficient, ρ the air density, and r the platter radius.
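A hedged sketch of evaluating this power model; the drag coefficient and air density values below are illustrative, not the paper's calibrated constants:

```python
# Sketch of the spindle-motor power model:
#   P_SPM = n * b_SPM * omega^2.8,  b_SPM = Cd * rho * pi * r^4.6 / 2
import math

def spm_power(n_platters, omega_rad_s, radius_m, Cd=0.01, rho=1.2):
    """Spindle power in watts (Cd and rho are illustrative values)."""
    b_spm = Cd * rho * math.pi * radius_m**4.6 / 2
    return n_platters * b_spm * omega_rad_s**2.8

# A 4-platter 3.5" drive (platter radius ~4.6 cm) at 10,000 RPM:
omega = 10_000 * 2 * math.pi / 60   # convert RPM to rad/s
print(spm_power(4, omega, 0.046))
```

Note how strongly the model rewards smaller, slower platters: power scales as r^4.6 and omega^2.8, which is why the Pareto results later favor small-diameter designs.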


Optimization Example

• Knobs: Voltage of SPM and VCM

• All combinations of three different platter sizes: 1.8”, 2.6”, 3.3”; and 1, 2, and 3 platters/disk

• Can use SODA to calculate the Pareto-optimal points in the E-D space.
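Extracting the Pareto-optimal points from a set of candidate (E, D) datapoints can be sketched as follows; the candidate designs below are made up for illustration:

```python
# Sketch of finding Pareto-optimal designs in the E-D (energy, delay) space.

def pareto_front(points):
    """Keep the (E, D) points not dominated by any other point
    (lower is better in both dimensions)."""
    front = []
    for e, d in points:
        dominated = any(e2 <= e and d2 <= d and (e2, d2) != (e, d)
                        for e2, d2 in points)
        if not dominated:
            front.append((e, d))
    return sorted(front)

designs = [(1.4, 1.0), (0.9, 1.6), (1.1, 1.2), (1.6, 1.5), (0.8, 2.0)]
# (1.6, 1.5) is dominated by (1.1, 1.2) and is dropped from the front.
print(pareto_front(designs))
```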


Openmail: Pareto-Optimal Points

[Figure: E (J) vs. 1/TP (1/(bits/sec)) Pareto curves for nine designs: n = 1, 2, 4 platters at d = 1.8" (C = 76/152/305 GB), d = 2.6" (C = 159/318/637 GB), and d = 3.3" (C = 256/512/1026 GB).]

The Pareto curve of the 4-platter 1.8" HDD is much closer to the origin.


Using SODA at Runtime

• Measure the "Potential of Energy Reduction" (Θi) of each knob i
– Ratio of the percentage change in energy to the percentage change in delay using knob i at that instant

• Θi varies over time
– Need to measure Θi periodically


Sensitivity-Based Power Management (SBPM)

• Setting the Performance Guarantee
– Profile the workload running on the storage system for 100K requests without power management
– Do = average storage system response time during this period

• RT: average response time over a sampling window

• UT, LT: upper and lower thresholds specifying the range of acceptable performance around Do
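Setting the performance guarantee can be sketched as below; the 10% slack band around Do is an assumed value for illustration, not one stated in the talk:

```python
# Sketch of SBPM's performance-guarantee setup: profile without power
# management to get Do, then derive the UT/LT thresholds around it.

def performance_guarantee(profile_response_times_ms, slack=0.10):
    """Return (Do, UT, LT) from a profiling run; slack is illustrative."""
    Do = sum(profile_response_times_ms) / len(profile_response_times_ms)
    UT = Do * (1 + slack)   # above UT: performance violated, back off the knobs
    LT = Do * (1 - slack)   # below LT: slack available, save more energy
    return Do, UT, LT

Do, UT, LT = performance_guarantee([4.0, 6.0, 5.0, 5.0])
print(Do, LT, UT)
```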


Dynamic RPM Control Policy (DRPM)

• Similarities to SBPM
– Policy actions based on periodic response time measurements
– UT and LT thresholds to specify the performance range

• Key differences
– RPM is the only knob
– All disks ramped up to the highest RPM when performance drops below expectations

[Figure: the array controller drives the disks toward two goals: maintain acceptable performance and reduce energy consumption.]
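A minimal sketch of the DRPM control loop described above; the RPM levels and thresholds are illustrative values, not the paper's:

```python
# Sketch of a DRPM step: RPM is the only knob, and every disk ramps to the
# highest RPM level whenever the sampled response time exceeds UT.

RPM_LEVELS = [5400, 7200, 10000, 12000, 15000]  # illustrative levels

def drpm_step(rt_ms, disk_levels, UT=5.5, LT=4.5):
    """Return new per-disk RPM level indices given the sampled response time."""
    if rt_ms > UT:
        # Performance violated: ramp ALL disks to the highest RPM.
        return [len(RPM_LEVELS) - 1] * len(disk_levels)
    if rt_ms < LT:
        # Slack available: step each disk one level down.
        return [max(0, lvl - 1) for lvl in disk_levels]
    return disk_levels  # within the acceptable band: hold

levels = [2, 3, 1]
print(drpm_step(6.0, levels))  # -> [4, 4, 4]
print(drpm_step(4.0, levels))  # -> [1, 2, 0]
```

This all-or-nothing ramp-up is exactly what the results on the next slides criticize: it shifts RPM levels too often, and each transition takes on the order of seconds.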

Performance

[Figure: performance results.]


• DRPM shifts RPM levels too often

• RPM transition delays are high (~ seconds)


More details: TCOMP’08 paper

• "Sensitivity-Based Optimization of Disk Architecture", S. Sankar, Y. Zhang, S. Gurumurthi, and M. R. Stan, IEEE Transactions on Computers, Aug. 2008.

• Can SODA be extended to SSDs?

– Flash: $3.58/GB

– HDD: 38¢/GB

• A hybrid SSD/HDD architecture

• What are the right “knobs”?


Predicting the Future

• “Prediction is very difficult, especially if it's about the future”*

*Niels Bohr


• Among the many contenders for the title of "universal memory":

– MRAM

– PCM

– FeRAM

– OxRRAM, etc.


“Resistive” memories

• Two-terminal devices that can be in one of two "states":
– High Resistance ("1")
– Low Resistance ("0")

• Our "bet":
– A flavor of magnetic RAM (MRAM) called Spin-Torque Transfer RAM (STT-RAM)
– DARPA funding in collaboration with Grandis

GMR Devices

• Giant Magneto-Resistance – GMR

• Already used in HDD heads

From R. A. Buhrman, “Spin Torque Effects in Magnetic Nanostructures”, SpintechIV, 2007


MRAM vs. STT-RAM

[Figure: SONY MRAM cell and a 115 × 180 nm² STT-RAM cell from Grandis. Courtesy of Yiming Huai, Grandis, from a presentation by S. James Allen, UC Santa Barbara.]

• S. Ikeda, J. Hayakawa, Y. M. Lee, F. Matsukura, Y. Ohno, T. Hanyu, and H. Ohno, "Magnetic tunnel junctions for spintronic memories and beyond", IEEE Trans. Electron Devices 54, 991 (2007).

• Y. Huai, Z. Diao, Y. Ding, A. Panchula, S. Wang, Z. Li, D. Apalkov, X. Luo, H. Nagai, A. Driskill-Smith, and E. Chen, "Spin Transfer Torque RAM (STT-RAM) Technology", 2007 Int. Conf. on Solid State Devices and Materials, Tsukuba, 2007, pp. 742-743.


STT-RAM Projections

Source: A. Driskill-Smith, Y. Huai, "STT-RAM – A New Spin on Universal Memory", Future Fab, 23, 28. Table data: Nick Rizzo, Freescale, from a presentation by S. James Allen, UC Santa Barbara.

Parameter          | Toggle MRAM (180 nm) | Toggle MRAM (90 nm)* | DRAM (90 nm)+        | SRAM (90 nm)+ | FLASH (90 nm)+           | FLASH (32 nm)+           | ST MRAM (90 nm)* | ST MRAM (32 nm)*
Cell size (µm²)    | 1.25                 | 0.25                 | 0.05                 | 1.3           | 0.06                     | 0.01                     | 0.06             | 0.01
Read time          | 35 ns                | 10 ns                | 10 ns                | 1.1 ns        | 10-50 ns                 | 10-50 ns                 | 10 ns            | 1 ns
Program time       | 5 ns                 | 5 ns                 | 10 ns                | 1.1 ns        | 0.1-100 ms               | 0.1-100 ms               | 10 ns            | 1 ns
Program energy/bit | 150 pJ               | 120 pJ               | 5 pJ (needs refresh) | 5 pJ          | 30-120 nJ                | 10 nJ                    | 0.4 pJ           | 0.04 pJ
Endurance          | >10^15               | >10^15               | >10^15               | >10^15        | >10^15 read, >10^6 write | >10^15 read, >10^6 write | >10^15           | >10^15
Non-volatility     | YES                  | YES                  | NO                   | NO            | YES                      | YES                      | YES              | YES


Ultimate density (2F²) crosspoint architecture

Ref.: G. Rose et al., “Design Approaches for Hybrid CMOS/Molecular Mem. based on Exp’l Dev. Data,” GLSVLSI, 2006

[Figure: 4×4 crosspoint memory array with row and column decoders. Writes use voltages Vw and -Vw; control signals include R/W', CLK, and BitIn. A row-decoder schematic built from address inputs A0, A1 and their complements, VDD, and GND is also shown.]

Recap

• Revisited the past and the present, with peeks into the future

• Based on specific research at the University of Virginia

• We are eager to collaborate with industry partners

Email: [email protected]