CAECW 2008 -- Salt Lake City -- Veazey & Gaither

Varying Memory Size with TPC-C
Performance and Resource Effects

Jay Veazey and Blaine Gaither
Hewlett-Packard
[email protected] [email protected]




Motivation --- why is this interesting?

• More memory increases performance
  → How much?
  → Why exactly?
  → Reveal and quantify the underlying causes

• Focus is R&D tradeoffs
  → Performance, cost, schedule, power
  → How much memory to design into a commercial server?
  → Is memory latency more important than memory size?


Experimental Design

• Vary memory 32-192 GBytes
  → Measure
    • Throughput
    • Resource utilization
      – CPU, disk I/O, memory BW, CPI, OS context switches

• HP Integrity rx6600
  → Itanium 2 9050 CPUs (2S/4C)
  → About 750 disk drives

• TPC-C
  → Resource intensive
  → Standard, “coin of the realm”…easy to communicate
  → Unofficial results


Throughput

[Figure: Throughput vs Memory Size -- TPC-C SQL. x-axis: GBytes Memory (0 to 256); y-axis: throughput (120,000 to 240,000).]

• Increase of 48% in throughput


Resource Utilization

• I/O reduction accounts for 20% of the 48% throughput improvement.

• Where’s the rest of it?

Disk I/O and CPU utilization

GB Mem   thruput   CPU Util.   IOs / sec   Relative thruput   approx. % insts. devoted to I/O
32       149,934   99.7%       71,068      1.00               31%
64       173,017   99.0%       58,907      1.15               24%
96       184,716   99.7%       50,574      1.23               20%
128      196,521   99.5%       44,397      1.31               17%
192      221,289   99.9%       29,422      1.48               11%
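The derived columns in this table can be re-checked with a few lines of arithmetic. The sketch below is illustrative only: the row values are copied from the table, while the names `ROWS`, `summarize`, and `ios_per_thruput` are made up for this example. It recomputes relative throughput against the 32 GB run, the I/O rate per unit of throughput, and the drop in the share of instructions devoted to I/O.

```python
# Rows copied from the "Disk I/O and CPU utilization" table:
# (GB memory, throughput, I/Os per second, approx. % of instructions devoted to I/O)
ROWS = [
    (32, 149_934, 71_068, 31),
    (64, 173_017, 58_907, 24),
    (96, 184_716, 50_574, 20),
    (128, 196_521, 44_397, 17),
    (192, 221_289, 29_422, 11),
]


def summarize(rows):
    """Return per-configuration ratios relative to the first (smallest) run."""
    base_thruput = rows[0][1]
    out = []
    for gb, thruput, ios, io_pct in rows:
        out.append({
            "gb": gb,
            # Throughput relative to the 32 GB baseline.
            "relative_thruput": round(thruput / base_thruput, 2),
            # I/Os per second divided by throughput (hypothetical helper metric).
            "ios_per_thruput": round(ios / thruput, 3),
            "io_pct": io_pct,
        })
    return out


summary = summarize(ROWS)
# Percentage-point drop in the instruction share devoted to I/O, 32 GB vs 192 GB.
io_share_drop = summary[0]["io_pct"] - summary[-1]["io_pct"]

for row in summary:
    print(row)
print("I/O instruction share drop (percentage points):", io_share_drop)
```

The recomputed relative throughput matches the table's column, and the I/O instruction share falls by 20 percentage points between the smallest and largest configurations.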


CPI and Memory

• As memory is added, CPU cycles are used more efficiently

• But this is an effect, not a cause---why does CPI fall?
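A rough way to see why CPI matters here: with the CPUs pinned near 100% utilization, throughput is bounded by the cycles available divided by the cycles each transaction costs. The symbols below ($X$ for throughput, $N_{\text{cores}}$ for core count, $f$ for clock frequency, $I$ for instructions per transaction, and the subscripts 32 and 192 for the smallest and largest memory sizes) are notation assumed for this sketch, not taken from the slides.

```latex
X \;=\; \frac{N_{\text{cores}} \cdot f}{\mathrm{CPI} \cdot I}
\qquad\Longrightarrow\qquad
\frac{X_{192}}{X_{32}} \;=\; \frac{\mathrm{CPI}_{32} \cdot I_{32}}{\mathrm{CPI}_{192} \cdot I_{192}}
```

Under this simplified model, a throughput gain on fixed hardware has to come from fewer instructions per transaction (shorter pathlength), fewer cycles per instruction, or both.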

[Figure: CPI vs Memory Size. x-axis: GBytes Memory (0 to 256); y-axis: CPI (1.30 to 1.60).]


CPI and Memory Bandwidth

• CPI can change for many reasons, most irrelevant here

• Memory accesses are relevant
  – When a load misses cache, the delay counts toward CPI
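The link between cache misses and CPI can be written as a standard first-order stall model. This is a textbook simplification rather than the authors' own formula, and the symbols are assumptions for illustration: $\mathrm{CPI}_{\text{core}}$ is the CPI with a perfect memory system, $m_k$ is the number of misses per instruction at cache level $k$, and $p_k$ is the average exposed stall penalty, in cycles, for a miss at that level.

```latex
\mathrm{CPI} \;\approx\; \mathrm{CPI}_{\text{core}} \;+\; \sum_{k} m_k \, p_k
```

The model ignores overlap between outstanding misses, so it is only a guide to direction: fewer misses per instruction, or cheaper misses, lower the second term and therefore the CPI.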

[Figure: Memory Size vs Bus BW. x-axis: GBytes Memory (0 to 256); y-axis: Bus BW -- Mbytes / sec (2,600 to 3,200).]


Caches Stabilize with Increasing Memory

• Units normalized for throughput
  – accesses (or misses) / sec / CPU / tpmC

• L1 accesses imply that the registers also stabilize

memory (GB)   L1 accesses   L1 misses   L2 misses   L3 misses
32            6901          1549        183         22
64            6219          1377        155         19
96            5943          1297        139         17
128           5683          1232        127         16
192           5122          1095        109         14
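To put numbers on how much each normalized counter falls across the memory range, the sketch below computes the percentage reduction from the 32 GB run to the 192 GB run. The values are copied from the table. The names `CACHE_ROWS`, `LABELS`, and `percent_reduction` are hypothetical and exist only for this example.

```python
# Rows copied from the cache table:
# (GB memory, L1 accesses, L1 misses, L2 misses, L3 misses),
# each counter normalized as accesses (or misses) / sec / CPU / tpmC.
CACHE_ROWS = [
    (32, 6901, 1549, 183, 22),
    (64, 6219, 1377, 155, 19),
    (96, 5943, 1297, 139, 17),
    (128, 5683, 1232, 127, 16),
    (192, 5122, 1095, 109, 14),
]
LABELS = ["L1 accesses", "L1 misses", "L2 misses", "L3 misses"]


def percent_reduction(rows):
    """Percentage drop of each counter from the first row to the last row."""
    first, last = rows[0], rows[-1]
    result = {}
    for i, label in enumerate(LABELS):
        before, after = first[i + 1], last[i + 1]
        result[label] = round(100 * (before - after) / before, 1)
    return result


reductions = percent_reduction(CACHE_ROWS)
for label, pct in reductions.items():
    print(f"{label}: {pct}% lower at 192 GB than at 32 GB")
```

Every counter decreases monotonically as memory grows, which is the trend the slide title refers to.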


OS Thread Switches and Memory

• Reduced thread switches are probably the cause of register / cache stabilization --- working sets stay in registers and caches longer

[Figure: Thread Switches vs. Memory Size. x-axis: GBytes Memory (0 to 256); y-axis: thread switches / tpmC (3000 to 8000).]


Summary and Conclusions

• Adding memory increases performance significantly

• I/O is reduced, as well as I/O instruction pathlength

• Context switches are reduced as a result of less I/O
  – Fewer memory accesses
  – Lower CPI
  – More stable caches and registers