RedGateWebinar - Where did my CPU go?


“Where did my CPU go?”

Presented by:

Karl Arao

www.enkitec.com

whoami

Karl Arao

• Senior Technical Consultant @ Enkitec

• Performance and Capacity Planning Enthusiast

7+ years DBA experience

Oracle ACE, OCP-DBA, RHCE, OakTable

Blog: karlarao.wordpress.com

Wiki: karlarao.tiddlyspot.com

Twitter: @karlarao



Agenda

• HOWTO compare CPU speeds

• Cores vs Threads

• The different CPU events

• CPU Monitoring/Capacity Planning on consolidated environments


12:27:15 SYS@DEMO1> show parameter cpu_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------
cpu_count                            integer     16

Exadata V2 => 2s8c16t (2 sockets, 8 cores, 16 threads)

Socket0
  Core0: CPU0 CPU8
  Core1: CPU1 CPU9
  Core2: CPU2 CPU10
  Core3: CPU3 CPU11

Socket1
  Core0: CPU4 CPU12
  Core1: CPU5 CPU13
  Core2: CPU6 CPU14
  Core3: CPU7 CPU15
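The socket/core/thread layout above can be derived programmatically. The following is a minimal sketch, assuming Linux `/proc/cpuinfo`-style text that carries `processor`, `physical id` and `core id` fields (typical on x86, but not guaranteed on every platform or VM). The helper names `summarize_topology` and `sample_cpuinfo` are hypothetical, and the sample text is synthetic, built only to mirror the Exadata V2 numbering shown on the slide.

```python
from collections import defaultdict


def summarize_topology(cpuinfo_text):
    """Return (label, layout) for /proc/cpuinfo-style text.

    label  -> e.g. "2s8c16t" (sockets, cores, threads)
    layout -> {socket_id: {core_id: [logical CPU ids]}}
    """
    layout = defaultdict(lambda: defaultdict(list))
    current = {}
    # Each logical CPU is one stanza; stanzas are separated by blank lines.
    for line in cpuinfo_text.splitlines() + [""]:
        if line.strip():
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
        elif current:
            socket = int(current["physical id"])
            core = int(current["core id"])
            layout[socket][core].append(int(current["processor"]))
            current = {}
    sockets = len(layout)
    cores = sum(len(c) for c in layout.values())
    threads = sum(len(t) for c in layout.values() for t in c.values())
    return f"{sockets}s{cores}c{threads}t", layout


def sample_cpuinfo(sockets=2, cores_per_socket=4):
    """Build synthetic stanzas that mirror the numbering on the slide."""
    total_cores = sockets * cores_per_socket
    stanzas = []
    for cpu in range(total_cores * 2):        # two threads per core
        core_global = cpu % total_cores        # CPU8 shares a core with CPU0
        stanzas.append(
            f"processor\t: {cpu}\n"
            f"physical id\t: {core_global // cores_per_socket}\n"
            f"core id\t: {core_global % cores_per_socket}\n"
        )
    return "\n".join(stanzas)


label, layout = summarize_topology(sample_cpuinfo())
print(label)              # 2s8c16t
print(dict(layout[0]))    # {0: [0, 8], 1: [1, 9], 2: [2, 10], 3: [3, 11]}
```

On a real host the same function could be fed the contents of `/proc/cpuinfo`; the thread count it reports is what Oracle shows as cpu_count on this system.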

PART1: compare CPU speeds


Different methods:

• Published benchmarks

– TPC-C

– SPECint_rate2006

– SPECpower

• Actual Benchmarking

– cputoolkit

– SLOB (lio test)


TPC-C

• Transaction Processing Performance Council (TPC)

• Throughput => transactions per minute (tpmC)

• Price/Performance => USD / tpmC


• CPU performance => tpmC / core

• 1609186.39 / 16 = 100574

SPECint_rate2006

• Standard Performance Evaluation Corporation (SPEC)

• SPECint_rate2006

• Integer performance

• All CPUs are used

• Used by OEM12c Consolidation Planner (SYSMAN.EMCT_SPEC_RATE_LIB)

• CPU performance => SPECint_rate2006/core

• 702/16 = 43.875


SPECint_rate2006 columns: Result/# Cores, # Cores, # Chips, # Cores Per Chip, # Threads Per Core, Baseline, Result, Hardware Vendor, System

$ cat spec.txt | grep -i sun | grep -i x3-2 | sort -rnk1
44.0625, 16, 2, 8, 2, 632, 705, Oracle Corporation, Sun Blade X3-2B (Intel Xeon E5-2690 2.9GHz)
44.0625, 16, 2, 8, 2, 630, 705, Oracle Corporation, Sun Server X3-2L (Intel Xeon E5-2690 2.9GHz)
43.875, 16, 2, 8, 2, 628, 702, Oracle Corporation, Sun Server X3-2 (Intel Xeon E5-2690 2.9GHz)

2007 vs 2012

TPC-C columns: tpmC/core, System, tpmC, Price/Perf, Total System Cost, Currency, Database Software, Server CPU Type, Total Server Cores, Cluster, Date Submitted
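The per-core ranking done with grep and sort above can also be sketched in Python. This is an illustrative sketch only: the three rows are copied from the spec.txt output on the slide, the column positions follow the SPECint_rate2006 header (# Cores in the second column, Result in the seventh, System in the ninth), and the function name `rank_by_result_per_core` is hypothetical.

```python
import csv
import io

# Rows copied from the spec.txt output shown on the slide.
SPEC_ROWS = """\
44.0625, 16, 2, 8, 2, 632, 705, Oracle Corporation, Sun Blade X3-2B (Intel Xeon E5-2690 2.9GHz)
44.0625, 16, 2, 8, 2, 630, 705, Oracle Corporation, Sun Server X3-2L (Intel Xeon E5-2690 2.9GHz)
43.875, 16, 2, 8, 2, 628, 702, Oracle Corporation, Sun Server X3-2 (Intel Xeon E5-2690 2.9GHz)
"""


def rank_by_result_per_core(text):
    """Recompute Result / # Cores for each row and sort highest first."""
    ranked = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue
        cores = int(row[1])       # "# Cores"
        result = float(row[6])    # "Result" (SPECint_rate2006 peak)
        system = row[8]           # "System"
        ranked.append((result / cores, system))
    # sorted() is stable, so ties keep their input order.
    return sorted(ranked, key=lambda r: r[0], reverse=True)


ranked = rank_by_result_per_core(SPEC_ROWS)
for per_core, system in ranked:
    print(f"{per_core:8.4f}  {system}")
```

Recomputing the ratio from Result and # Cores, instead of trusting the first column, is a cheap consistency check on the extracted data.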


Actual Benchmarking

• cputoolkit and SLOB (lio test)

• LIOs/sec


cputoolkit

./runcputoolkit-auto <start CPU> <end CPU> <db name>

./runcputoolkit-auto 1 2 dw

SLOB

./runit.sh <writers> <readers>

while :; do ./runit.sh 0 2; done


V2 and X2 CPU perf comparison (both at 25% CPU utilization)

X2: 3.6M LIOs/sec

V2: 2.1M LIOs/sec


V2 -> X2 migration


chip efficiency factor = (source LIOs/sec) / (destination LIOs/sec)
= 2.1M / 3.6M
= .5833

X2 CPU requirement = source host CPUs * utilization * chip efficiency factor
= 16 * .46 * .5833
= 7.36 * .5833
= 4.29 CPUs

X2 CPU utilization = CPU requirement / CPU capacity
= 4.29 / 24
= 17.9 %
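The migration arithmetic above can be written as a small script. This is a sketch under the slide's own inputs (2.1M and 3.6M LIOs/sec, 16 source CPUs at 46% utilization, 24 destination CPUs); the function names are hypothetical and not part of cputoolkit or SLOB.

```python
def chip_efficiency_factor(source_lios_per_sec, dest_lios_per_sec):
    """Ratio of source to destination per-CPU throughput (LIOs/sec)."""
    return source_lios_per_sec / dest_lios_per_sec


def required_dest_cpus(source_cpus, source_utilization, factor):
    """CPUs needed on the destination to carry the source workload."""
    return source_cpus * source_utilization * factor


def dest_utilization(required_cpus, dest_cpus):
    """Fraction of the destination host consumed by the workload."""
    return required_cpus / dest_cpus


factor = chip_efficiency_factor(2.1e6, 3.6e6)   # V2 -> X2
need = required_dest_cpus(16, 0.46, factor)      # 16 CPUs at 46% busy
util = dest_utilization(need, 24)                # X2 host with 24 CPUs

print(f"chip efficiency factor: {factor:.4f}")
print(f"X2 CPU requirement:     {need:.2f} CPUs")
print(f"X2 CPU utilization:     {util:.1%}")
```

Keeping the three steps as separate functions makes it easy to rerun the estimate for several source hosts and sum the requirements before dividing by the destination capacity.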

v2, x3, x3


PART2: Cores vs Threads

Socket0
  Core0: CPU1 CPU5
  Core1: CPU2 CPU6
  Core2: CPU3 CPU7
  Core3: CPU4 CPU8

Hyper-Threading benefit: ~30%, depending on the workload

cputoolkit: 17%

SLOB: 21%

Intel HT Technology Technical User's Guide http://goo.gl/3Ec5Z

PART3: Different CPU events

CPU

CPU Wait

CPU Scheduler


AAS CPU


CPU Wait


CPU Scheduler


Putting it all together


Problem: A single SQL statement overwhelming CPU resources.

Instances caged at 12 CPUs.

SQL applied to lock in good plan.

PART4: CPU monitoring and Capacity Planning

OS Tools

• The usual Operating System commands

– vmstat

– top

– mpstat -P ALL 1 5

• Cool tools

– collectl -sC (http://collectl.sourceforge.net)

– turbostat.c

– dcli (Exadata)

• dcli -l oracle -g /home/oracle/dbs_group --vmstat 2

• dcli -l oracle -g /home/oracle/dbs_group uptime
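On Linux, tools such as vmstat and mpstat report utilization derived from the kernel's cumulative CPU counters in `/proc/stat`. The sketch below shows that calculation on two samples of the aggregate `cpu` line. It assumes the Linux field order (user, nice, system, idle, iowait, irq, softirq, steal, then guest fields), counts iowait as idle time (a choice, not a rule), and uses hypothetical counter values; the function name `cpu_utilization` is also hypothetical.

```python
def cpu_utilization(stat_before, stat_after):
    """Percent busy between two aggregate 'cpu' lines from /proc/stat."""

    def parse(line):
        fields = [int(x) for x in line.split()[1:]]
        # idle + iowait are treated as "not busy" in this sketch.
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
        # Guest time is already included in user/nice, so only the
        # first eight counters are summed for the total.
        return sum(fields[:8]), idle

    total0, idle0 = parse(stat_before)
    total1, idle1 = parse(stat_after)
    total = total1 - total0
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return 100.0 * (1 - (idle1 - idle0) / total)


# Hypothetical counter values, in clock ticks, from two samples.
before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
after = "cpu  1200 0 550 8650 600 0 0 0 0 0"
busy = cpu_utilization(before, after)
print(f"{busy:.1f}% busy")    # 25.0% busy
```

In practice the two lines would come from reading `/proc/stat` twice with a sleep in between, which is roughly what the interval argument to vmstat or mpstat controls.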


Load Map


Performance Page – Historical View


AWR Toolkit

• DIY performance data warehouse

1 Customer site

– run_awr: Extract AWR data points as csv files

– run_extract: Package all the csv files

2 DIY DW server

– FRESH_LOAD: Create new client “dimension” tables

– CHECK_LOAD: Check new data points

– DELTA_LOAD: Load new data points

Per-client tables:

– awr_topevents_(ClientNameX), awr_cpuwl_(ClientNameX), awr_iowl_(ClientNameX)

– awr_topevents_(ClientNameY), awr_cpuwl_(ClientNameY), awr_iowl_(ClientNameY)

– awr_topevents_(ClientNameZ), awr_cpuwl_(ClientNameZ), awr_iowl_(ClientNameZ)

3 Tableau Analytics

AWR data

• Top Events

– AAS CPU, latency, wait class

• SYSSTAT

– PGA, SGA, physical memory, Executes/sec

• IO

– IOPS breakdown, MB/s

• CPU

– Load Average, NUM_CPUs,

• Storage

– total storage size, per tablespace size

• Services

– distribution of workload/modules

• Top SQL

– PIOs, LIOs, modules, SQL type, SQL_ID, PX

Correlate across months of workload data! http://goo.gl/7uCk7w


• Tableau automatically creates a time dimension from the time column (“MM/DD/YY HH24:MI:SS”) of the AWR csv output
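Outside Tableau, the same time column can be parsed by hand. The sketch below maps the Oracle mask MM/DD/YY HH24:MI:SS to the equivalent `strptime` format and averages a metric per hour. The column names `snap_time` and `aas_cpu` and the sample rows are hypothetical, not the actual layout of the AWR csv files.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Python equivalent of the Oracle date mask MM/DD/YY HH24:MI:SS.
TIME_FORMAT = "%m/%d/%y %H:%M:%S"


def hourly_average(csv_text, time_col, value_col):
    """Group rows by hour of the time column and average the value column."""
    buckets = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.strptime(row[time_col], TIME_FORMAT)
        hour = ts.replace(minute=0, second=0)
        buckets[hour].append(float(row[value_col]))
    return {hour: sum(v) / len(v) for hour, v in sorted(buckets.items())}


# Hypothetical sample data; real AWR extracts will have other columns.
SAMPLE = """\
snap_time,aas_cpu
01/15/13 01:00:00,2.0
01/15/13 01:30:00,4.0
01/15/13 02:00:00,6.0
"""

averages = hourly_average(SAMPLE, "snap_time", "aas_cpu")
for hour, value in averages.items():
    print(hour.strftime("%H:00"), value)
```

Bucketing by hour is what makes slices such as 1-2AM and 2-3AM directly comparable across hosts.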


• Summary and Underlying data


1-2AM

2-3AM

CPU usage across half rack Exadata


CPU usage per host


CPU redistribution across nodes


Wrap up!

• HOWTO compare CPU speeds

o SPECint_rate2006, TPC-C, Actual benchmarking

• Cores vs Threads

o Always have HT on

o ~30% performance benefit after core count

• The different CPU events

o 1 AAS CPU = 1 CPU thread

o Oracle CPU may not correlate with Host CPU if you have a lot of CPU activity outside of the database

• CPU Monitoring/Capacity Planning on consolidated environments

o AWR analytics
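The "1 AAS CPU = 1 CPU thread" rule of thumb can be turned into a quick headroom check. This sketch uses the general definition of average active sessions (time spent divided by elapsed time) applied to CPU time only; the function names and the example figures (14,400 CPU seconds in a one-hour interval on a 16-thread host) are hypothetical.

```python
def aas_cpu(cpu_seconds, elapsed_seconds):
    """Average active sessions on CPU over an interval."""
    return cpu_seconds / elapsed_seconds


def cpu_headroom(aas, cpu_threads):
    """Compare AAS CPU with the thread count (cpu_count)."""
    return {
        "aas_cpu": aas,
        "utilization": aas / cpu_threads,   # database share of the host
        "idle_threads": cpu_threads - aas,
    }


# Hypothetical example: 14,400 CPU seconds used by the database in
# a 3,600-second interval on a host with 16 CPU threads.
aas = aas_cpu(14_400, 3_600)
report = cpu_headroom(aas, 16)
print(report)
```

This accounts only for CPU consumed by the database, so host-level utilization can be higher when there is significant CPU activity outside of it.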


Resources

• White paper: http://goo.gl/eq9Sn

• cputoolkit - http://karlarao.wordpress.com/scripts-resources/

• AWR Tableau and R toolkit Visualization Examples - http://goo.gl/xZHHY

• AAS investigation - http://goo.gl/5WaAg

• Cores vs Threads - http://goo.gl/1MLFf

• Turbostat.c - http://goo.gl/jDUKg

• cpu_topology - http://goo.gl/EUDG7

• CPU centric benchmark comparisons - http://goo.gl/nR9Yy

• SLOB - http://goo.gl/yKa45

• Kyle Hailey - http://dboptimizer.com/2011/07/21/oracle-cpu-time/

• Book: Computer Architecture: A Quantitative Approach 5th Ed - Chapter 1, Section 1.10 “Putting it all together: Perf, Price, Power” http://goo.gl/MXigAQ

• Book: The Art of Scalability - Ch11 “Headroom” http://theartofscalability.com

• The mindmap of this presentation - http://goo.gl/XeY0e


karl.arao@enkitec.com


Fastest Growing Companies

in Dallas