Paul V. Novarese
Sr. Technical Account Manager
11 September 2014
Performance: Observe and Tune
What can we do out of the box?
What is tuned?
Tuning profile delivery mechanism
Red Hat ships tuned profiles that improve performance for many workloads...hopefully yours!
Tuned: Storage Performance Boost
tuned Profile Summary: RHEL6
Tuned: Updates for RHEL7
Installed by default!
Profiles automatically set based on install type:
Desktop/Workstation: balanced
Server/HPC: throughput-performance
Single tuned.conf file
Optional hook/callout capability
Inheritance (cf. httpd.conf)
Profiles updated for RHEL 7 features (obv)
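The single-file layout and inheritance can be pictured with a small sketch. It assumes the RHEL 7 tuned.conf INI layout, in which an include= key under [main] names the parent profile. The child profile text and its vm.swappiness value are hypothetical, and parse_profile is an illustrative helper, not part of tuned.

```python
import configparser

# Hypothetical child profile; the override value is illustrative only.
# Assumes the RHEL 7 tuned.conf layout where [main] include= names the parent.
EXAMPLE_TUNED_CONF = """
[main]
include=throughput-performance

[sysctl]
vm.swappiness=10
"""


def parse_profile(text):
    """Return (parent_profile, overrides) from tuned.conf-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # The parent is whatever [main] include= points at, if anything.
    parent = parser.get("main", "include", fallback=None)
    # Every other section holds settings layered on top of the parent.
    overrides = {
        section: dict(parser.items(section))
        for section in parser.sections()
        if section != "main"
    }
    return parent, overrides


if __name__ == "__main__":
    parent, overrides = parse_profile(EXAMPLE_TUNED_CONF)
    print("inherits from:", parent)
    print("overrides:", overrides)
```

The child profile only states what differs from its parent, which is the point of the inheritance model.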
tuned Throughput Profiles: RHEL 7
tuned Latency Profiles: RHEL 7
tuned Virt Profiles: RHEL 7
Let's get our hands dirty...
Tuning Strategies: Bang for your Buck
Problem Statements
Bad
It's slow
Make it go faster
Better
We expect 37 gigaflups/year but we only see 24
We have a bottleneck to a particular LUN in the SAN
Turn bad statements into good statements
Determine victory conditions
Get data
Look at data
Tweak
GOTO 10
Questions to ask
What's actually slow?
How do we know it's slow?
What is the expectation and what is that based on?
What is actually needed to win?
What changed?
How long has it been slow?
Gradual or sudden change?
Are there patterns? (same time every day?)
Can you do something to (temporarily) recover?
What evidence do you have? (sar, iostat, etc?)
Identify bottlenecks
CPU
Memory
IO
Network
Application
Firmware
Basic IO Tuning Strategy
Multiple HBAs
Install (e.g.) device-mapper-multipath
Default settings in /usr/share/doc/device-mapper-multipath-0.4.7/multipath.conf.defaults
Understand storage features / limitations
Maximum random and sequential reads and writes per port
Maximum random and sequential reads and writes for the controller
Low-level I/O numbers
Tools to use: dd, aiod, aio-stress, IOzone
Run I/O representative of the database implementation
I/O Schedulers
CFQ, Deadline, AS, Noop
IO Schedulers
4 tunable I/O Schedulers
CFQ (elevator=cfq): Completely Fair Queuing. The default; balanced and fair across multiple LUNs, adapters, and SMP servers.
NOOP (elevator=noop): No-operation. A simple in-kernel FIFO with low CPU overhead; leaves optimization to the ramdisk, RAID controller, etc.
Deadline (elevator=deadline): Optimizes for run-time-like behavior and low latency per IO; balances issues with large IO LUNs/controllers. Batches IO operations to produce predictable latencies.
Anticipatory (elevator=as): Inserts delays to help the stack aggregate IO; best on systems with limited physical IO, such as SATA.
Changing I/O Schedulers
echo deadline > /sys/block//queue/scheduler
Append 'elevator=' to end of kernel line
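Before changing a scheduler it helps to see which one is active. The sysfs scheduler file lists the available schedulers and marks the active one in square brackets. A minimal sketch of reading that format follows; the helper names are illustrative, and the device name passed to read_scheduler is whatever block device the caller chooses.

```python
import re
from pathlib import Path


def active_scheduler(text):
    """Return the bracketed scheduler from text such as 'noop [deadline] cfq'."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def available_schedulers(text):
    """Return every scheduler named in the text, active or not."""
    return text.replace("[", "").replace("]", "").split()


def read_scheduler(device, sys_block="/sys/block"):
    """Read the raw scheduler file for one block device (caller supplies the name)."""
    return (Path(sys_block) / device / "queue" / "scheduler").read_text()


if __name__ == "__main__":
    # Sample text in the sysfs format; real output depends on the kernel.
    sample = "noop anticipatory deadline [cfq]"
    print("active:", active_scheduler(sample))
    print("available:", available_schedulers(sample))
```

Checking the file again after the echo confirms that the change took effect for that device.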
Basic CPU Tuning Strategy
Limit CPU access
One or more processes can consume all CPU cycles
Completely Fair Scheduler (CFS) in RHEL6 uses scheduler groups to assign different weights to each group
Configure cgroups and set cpu.shares for each group
Manually balance interrupts
cat /proc/interrupts to see how interrupts are distributed to each CPU
Edit /etc/sysconfig/irqbalance and set IRQBALANCE_BANNED_CPUS=
As an alternative, echo 1 > /proc/irq/142/smp_affinity
Pin processes to a specific CPU
taskset (non-NUMA)
numactl
Cgroups
Utilize real-time scheduling (nice, MRG)
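The value written to /proc/irq/142/smp_affinity is a hexadecimal bitmask in which bit N selects CPU N, so "1" means CPU 0 only. A minimal sketch of converting between CPU lists and masks follows; the function names are illustrative, and the comma handling assumes the kernel's grouping of long masks into 32-bit chunks.

```python
def cpus_to_mask(cpus):
    """Return the hex affinity mask (no 0x prefix) selecting the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu  # bit N stands for CPU N
    return format(mask, "x")


def mask_to_cpus(mask_hex):
    """Return the sorted CPU numbers selected by a hex affinity mask."""
    # Large systems print masks as comma-separated 32-bit groups.
    mask = int(mask_hex.replace(",", ""), 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    print(cpus_to_mask([0]))        # CPU 0 only
    print(cpus_to_mask([2, 3]))     # CPUs 2 and 3
    print(mask_to_cpus("f"))        # first four CPUs
```

taskset accepts the same kind of hexadecimal mask, so the conversion is useful for process pinning as well as for IRQs.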
Basic VM Tuning Strategy
Huge Pages
2MB huge page size
Set value in /etc/sysctl.conf (vm.nr_hugepages)
Benefits - https://access.redhat.com/knowledge/solutions/2592
Enabling - https://access.redhat.com/knowledge/solutions/46326
Transparent Huge Pages
https://access.redhat.com/knowledge/solutions/46111
NUMA
Localized memory access for certain workloads improves performance
Swap
Set vm.swappiness (default 60); a lower value suits interactive applications and avoids swapping as much as possible
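Sizing vm.nr_hugepages is simple arithmetic: divide the memory region to be backed by the 2MB page size and round up. A minimal sketch follows; the 8 GiB region is a hypothetical example (say, a database shared memory segment), and the helper names are illustrative.

```python
HUGE_PAGE_BYTES = 2 * 1024 * 1024  # 2MB huge page size, as stated on the slide


def hugepages_needed(region_bytes, page_bytes=HUGE_PAGE_BYTES):
    """Return how many huge pages are needed to cover region_bytes, rounding up."""
    if region_bytes < 0:
        raise ValueError("region_bytes must be non-negative")
    return -(-region_bytes // page_bytes)  # integer ceiling division


def sysctl_line(pages):
    """Format the setting as it would appear in /etc/sysctl.conf."""
    return f"vm.nr_hugepages = {pages}"


if __name__ == "__main__":
    region = 8 * 1024 ** 3  # hypothetical 8 GiB region
    print(sysctl_line(hugepages_needed(region)))
```

Memory reserved for huge pages is unavailable for ordinary allocations, so the figure should track what the application will actually use.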
VM Tuning Frequent Fliers
/proc/sys/vm/swappiness: Should I swap or drop cache?
/proc/sys/vm/min_free_kbytes: Be careful adjusting this! Extremes are bad.
/proc/sys/vm/dirty_ratio
/proc/sys/vm/dirty_background_ratio
/proc/sys/vm/vfs_cache_pressure
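Recording the current values of these knobs before tweaking makes it easy to answer "what changed?" later. A minimal sketch that reads the five files listed above follows; read_vm_tunables and its root parameter are illustrative, and a missing or unreadable file is reported as None.

```python
from pathlib import Path

# The five tunables named on this slide.
TUNABLES = (
    "swappiness",
    "min_free_kbytes",
    "dirty_ratio",
    "dirty_background_ratio",
    "vfs_cache_pressure",
)


def read_vm_tunables(root="/proc/sys/vm"):
    """Return {name: int or None} for each tunable found under root."""
    values = {}
    for name in TUNABLES:
        try:
            values[name] = int((Path(root) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None  # file missing, unreadable, or not an integer
    return values


if __name__ == "__main__":
    for name, value in read_vm_tunables().items():
        print(f"vm.{name} = {value}")
```

Saving this output alongside sar data gives a baseline to compare against after each tuning pass.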
80/20 Rule
More like 95/5
At some point our time and effort is best spent elsewhere
What tools can we use?
sar
iostat
perf - Userspace tool to read CPU counters and kernel tracepoints
Performance Co-Pilot (pcp) - new in RHEL 7
Tools
Tradition: start with sar
Built-in
Collects stats for all four major system components (cpu, memory, IO, network)
Data can be easily graphed
Data collection frequency can be easily changed
RHEL 6 sar metadata is different than RHEL 5 - you cannot use RHEL 6 sar to read RHEL 5 sar files.
Collectl
More complex, but more powerful
Can handle NFS, slab data, and sub-second intervals (e.g. -i .25)
Very low overhead (