Virtualizing Big Applications
Scott Drummonds
Group Manager, Performance Marketing
16 October, 2008
Overview
Myths and Realities
Case Studies
CPU Intensive Workloads
I/O and Network Intensive Workloads
Applications
Oracle, SAP, MS Exchange, SQL Server, MySQL
Relevant Architectural Directions
Myths & Perceptions
Virtualization is only for smaller, non-critical systems
Large applications have high overhead
I/O Intensive Applications have too much IO to be virtualized
Performance management will become complex
I can’t run too many performance critical VMs at the same time
Databases, Email Services, and High End Web Workloads are all virtualizable
Large CPU, Memory and I/O footprint applications have been the focus for ESX since 3.0
ESX can drive over 100,000 IOPS and 500 disks per host, enough for 85 average databases
ESX's resource management controls allow resources to be allocated to create isolation between critical applications
Near-linear scalability allows many virtual machines to be consolidated without additional overhead
Why Virtualize Larger or Mission Critical Apps?
For Consolidation
Leverage resource sharing: CPU, Memory, I/O Connectivity
Consolidate several peaky workloads to increase utilization
Need to leverage multi-core platforms most efficiently
For Availability
Minimize downtime via Virtual Infrastructure HA/VMotion/DRS
For Operational Efficiency
Want to leverage virtualization infrastructure already in place for other apps
Take advantage of increased administrative flexibility
Evolution of VI Performance for Large Apps
(Chart: ability to satisfy performance demands, from the general population of apps up to 100% of mission-critical apps)

ESX 2.x: 30-60% overhead; 2-way SMP; 3.6 GB VM RAM; 16-CPU support with 4-CPU scaling; 64 GB phys RAM; <10,000 IOPS; 380 Mbits max
VI 3.0: 20-30% overhead; 2-way SMP scaling; 16 GB VM RAM; 16-core support; 64 GB phys RAM; 800 Mbits; Gen-1 HW virtualization
VI 3.5: 10-20% overhead; 4-way VSMP scaling; 64-bit OS support; 64 GB VM RAM; 64-core hosts; 256 GB phys RAM; 100,000 IOPS; 9 Gbits; Gen-2 HW virtualization
Future: 2-10% overhead; 8-way VSMP scaling; 256 GB VM RAM; 128-core scaling; 512 GB phys RAM; 200,000 IOPS; 40 Gbits; 512 VMs
Enterprise-Class Performance (ESX 3.5)
CPU
Virtual SMP with 4-processor scalability; 90%-95% of native server performance
IO
Extreme disk I/O supports 100,000+ DB I/Os per second; wire-speed network I/O supports 9 Gbps
Memory
64 GB per VM; 256 GB per host; advanced memory management
Consolidation is a way to use all those cores…
Most applications don’t scale beyond 4/8 way
Virtualization provides a means to exploit the hardware's increasing parallelism
VMware ESX Scaling: Keeping Up with Core Counts
(Chart: core-count scaling for ESX 3.5 and future ESX releases)
Example: Hardware Out-scales Web Servers
(Chart: CPU utilization distribution; number of systems on a log scale from 1 to 100,000 vs. % CPU utilization from 0 to 100)
Consolidation & Sizing
Consolidation targets are often <30% Utilized
Windows average utilization: 5-8%
Linux/Unix average: 10-35%
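Those utilization averages translate directly into consolidation ratios. A back-of-envelope sketch, assuming an 8% Windows average from above and a 65% target host utilization (the target figure is an assumption, not from the deck):

```shell
# Rough consolidation ratio from average utilization.
# avg_util: average utilization of a candidate (8% Windows, per the deck)
# target_util: utilization we are willing to drive the host to (assumed)
avg_util=8
target_util=65
awk -v a="$avg_util" -v t="$target_util" \
  'BEGIN { printf "%d:1 consolidation ratio\n", t / a }'
```

Lighter or peakier candidates shift this ratio; the deck's <30% utilization targets still leave ample room for stacking.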
Recommendations on Virtualizing Big Apps
Candidate Selection: Which apps can I virtualize?
Size VMs Correctly
Setup Performance Monitoring Architecture
Use the right SW: e.g. 64-bit Implementations
Configure VMware ESX Appropriately
Configure the Virtual I/O Infrastructure Accordingly
Configure the Application Optimally
Sizing and Requirements
Virtual machine sizing is different from physical sizing
Don't just take the number of CPUs in the physical system as the vCPU requirement
Many physical systems are sized for peak utilization, with ample headroom for future growth
As a result, utilization is often very low in physical systems
With virtual machines, it's not necessary to build in that headroom
For example, many databases running on 4-CPU systems can easily run in a 2-vCPU guest
Migration to future headroom is easy: e.g. 8-vCPU support
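A minimal sizing sketch for the rule above: translate a physical box's observed peak into a vCPU count, assuming the VM's vCPUs may run about 80% busy. All the input numbers here are illustrative:

```shell
# Estimate vCPUs from measured peak utilization (illustrative values).
peak_util=40      # % peak CPU utilization observed on the physical system
phys_cores=4      # cores in the physical system
target_util=80    # acceptable busy-ness for the VM's vCPUs (assumption)
awk -v p="$peak_util" -v c="$phys_cores" -v t="$target_util" \
  'BEGIN { v = int(p * c / t + 0.999); if (v < 1) v = 1; print v " vCPUs" }'
```

With these numbers a 4-CPU physical database lands in a 2-vCPU guest, matching the example above.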
Common Questions for "Big Apps"
Is VMware ESX able to scale up to the required number of CPUs/cores?
Are the I/O overheads of virtualization too high?
Requirement        MS Exchange Example (Exchange 2007)   ESX Capability
SMP Scaling        Large node: 4 cores                   4 vCPU
Hardware Scaling   Multiple VMs                          32 pCPU today; allows 8 x 4-vCPU VMs
Memory             12 GB @ 4 cores                       64 GB per VM
Network            10 Mbits @ 2,000 users                980 Mbits on GbE; >5 Gbits on multiple GbE; 9.8 Gbits on 10GbE
I/O                1,000 IOPS @ 2,000 users              >100,000 IOPS on large disk configurations
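The "allows 8 x 4-vCPU VMs" entry follows from simple division of the table's figures; a sketch using those numbers:

```shell
# 32 physical cores divided by 4 vCPUs per Exchange VM (table values).
pcpus=32
vcpus_per_vm=4
echo "$(( pcpus / vcpus_per_vm )) Exchange VMs per host"
```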
CPU - Memory - I/O Ratios for Applications

Application   Memory (per core)   Disk IOPS (per core)
Exchange      6 GB                500
Oracle        2 GB                320
SQL Server    2 GB                240
DB2           2 GB                160
Desktop       6 GB                100
Web Servers   2 GB                320

Assumption is that the core is 80% busy; lighter VMs will need proportionally less resource
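Scaling the table's per-core figures for a lighter VM, per the proportionality note. This sketch takes the Oracle row (2 GB, 320 IOPS per core) at 40% busy instead of the 80% baseline; the 40% figure is illustrative:

```shell
# Scale per-core memory and IOPS proportionally to CPU busy-ness.
awk -v mem=2 -v iops=320 -v busy=40 -v base=80 \
  'BEGIN { printf "%g GB, %g IOPS per core\n", mem * busy / base, iops * busy / base }'
```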
• 200,000 Mailboxes
• 1.3% Latency Overhead at Full load
• ESX layer is effectively transparent to high-I/O Apps
Can I Virtualize High I/O Applications?
Databases
Databases: Why Use VMs Rather than DB Virtualization?
Virtualization at hypervisor level provides the best abstraction
Each DBA has their own hardened, isolated, managed sandbox
Strong Isolation
Security
Performance/Resources
Configuration
Fault Isolation
Scalable Performance
Low-overhead virtual Database performance
Efficiently Stack Databases per-host
Measuring the Performance of DB Virtualization
Consolidating Microsoft SQL Server on ESX
(Chart: CPU utilization & throughput; OPM from 0 to 18,000 and average %CPU from 0 to 35 for the 1-vm through 4-vm configurations)
Oracle Performance (Response time)
Databases: Tuning Recommendations
Use 64-bit Database
Add enough memory to cache DB, reduce I/O
Use Direct I/O, the high-performance uncached path in the guest operating system
Use Asynchronous I/O to reduce system calls
Use Large MMU Pages
Optimize Storage Layout, # of Disk Spindles
Databases: Storage Configuration
Storage considerations
VMFS or RDM
Fibre Channel, NFS or iSCSI
Partition Alignment
Multiple storage paths
OS/App, Data, Transaction Log and TempDB on separate physical spindles
RAID 10 or RAID 5 for data; RAID 1 for logs
Queue depth and Controller Cache Settings
SAP, Exchange
SAP Virtualization on ESX 3.5
SAP building blocks tested: CI+DB with 1 vCPU / 4 GB, CI+DB with 2 vCPU / 8 GB, and CI+DB plus 3 dialog instances (DIA) with 4 vCPU / 16 GB
(Chart: number of SD users vs. number of vCPUs (1, 2, 4), native vs. virtual; virtual is within 7.3 - 10.2% of native)
Scaling Exchange (Natively) On a Single Server
Is storage the limit? No. Exchange 2007 on Windows Server makes excellent use of correctly configured storage.
Is CPU the limit? Possibly. The Mailbox Server's recommended maximum is eight cores.
http://technet.microsoft.com/en-us/library/aa998874(EXCHG.80).aspx
Is memory the limit? Possibly. The maximum recommended memory allocation is 32 GB per server.
http://www.microsoft.com/technet/prodtechnol/exchange/2007/plan/hardware.mspx?wt.svl=sysreqs
Maximum mailboxes: 8,000 (eight cores at the recommended 1,000 mailboxes per core)
This provides 3.75 MB/user at the server, in the middle of the 2-5 MB/user recommendation
Multi-VM Scaling of Exchange on VI3
(Chart: mailboxes, 2K to 16K, vs. CPUs or cores, 2 to 16)
Building blocks stay within Microsoft recommendations:
• 1,000 mailboxes/core
• 5 MB/mailbox
Maximum performance!
• Three building blocks break through pre-virtual memory boundaries
• Five building blocks shatter pre-virtual CPU limitations
• Eight building blocks enable 16,000 users
VI3 provides native-matching performance and complete resource utilization
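The 16,000-user figure is simple multiplication, assuming each building block is a 2-vCPU VM at Microsoft's 1,000 mailboxes/core (the per-block size is inferred from the chart, so treat this as a sketch):

```shell
# 8 building blocks x 2 cores per block x 1,000 mailboxes/core.
blocks=8; cores_per_block=2; mailboxes_per_core=1000
echo "$(( blocks * cores_per_block * mailboxes_per_core )) users"
```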
(Chart annotations: 5 MB/mailbox performance threshold; eight-core limit recommended by Microsoft; 1,000 mailboxes/core line)
Building Block Performance
(Chart: mailboxes, 0 to 18,000, and SendMail latency in ms, 0 to 500, for 1 VM through 8 VMs)
Record-setting Server Performance
Performance Futures
Futures at a Glance
VMware Infrastructure is the Best Place to Run Enterprise Applications
Focus on large applications: Oracle, SQL Server, etc…
Increased Scalability
For each Virtual Machine
For ESX Host
Taking Advantage of Hardware Assist
Instruction level, Memory Management and Device Driver
Improved Efficiency with I/O activities
Paravirtualized SCSI Driver
Summary
Virtual Infrastructure is ideally suited for managing and controlling large applications
VMware ESX has proven scale-up and scale-out performance
Flexible disk options allow for optimal disk layouts and storage configurations
Learn about the available tuning recommendations for ESX and specific applications
Resources
Performance community
http://communities.vmware.com/community/vmtn/general/performance
Performance web site
http://www.vmware.com/overview/performance
VROOM! performance blog
http://blogs.vmware.com/performance
Backup Slides
Can I Virtualize CPU Intensive Applications?
Most CPU intensive applications have very low overhead
VMware ESX 3.x compared to native:
SPECcpu results covered by the O. Agesen and K. Adams paper
WebSphere results published jointly by IBM and VMware
SPECjbb results from recent internal measurements
Can I Virtualize High Networking I/O Applications?
Overall response time is lower when CPU utilization is less than 100% due to multi-core offload
• 200,000 Mailboxes
• 77 terabytes of data
• Enough I/O for 80 large databases
• Enough to hold the entire printed US Library of Congress
EMC CLARiiON CX3-80
VMware ESX
Storage Fabric
• 500 disks
• 16 cores (Intel Tigerton)
• 16 VMs (Windows 2003 Server)
• I/O-intensive workload: 8k block size, 100% random, mixed read/write
Mainframe class I/O in ESX 3.5
Summary
Generational improvement to VMware products has enabled the most demanding enterprise applications
VMware performance exceeds the needs of virtually every application
Guidance exists for the setup and management of the most common applications
Backup: DB Configuration
Oracle File System Sync vs DIO
Oracle DIO vs. RAW
Comparison of DB cache vs. guest FS cache

Cache      Rows/sec   Kernel CPU   User CPU
FS cache   287,114    28%          71%
DB cache   695,700    5%           94%
• A 46GB table was populated, indexed, and then queried by 100 processes each requesting a range of data. A single row was retrieved at a time to simulate what would happen in an OLTP environment. The data was cached so that no IO occurred during any of the runs.
• Ref: Glenn Fawcett, PAE, Sun
Direct I/O
Guest-OS level option for bypassing the guest cache
Uncached access avoids multiple copies of data in memory
Avoids read/modify/write cycles on partial file-system blocks
Bypasses many file-system level locks
Enabling Direct I/O for Oracle and MySQL on Linux

Oracle:
# vi init.ora
filesystemio_options="setall"

Check:
# iostat 3
(Check for I/O size matching the DB block size…)

MySQL:
# vi my.cnf
innodb_flush_method = O_DIRECT

Check:
# iostat 3
(Check for I/O size matching the DB block size…)
Asynchronous I/O
An API that lets a single-threaded process launch multiple outstanding I/Os
Multi-threaded programs could simply use multiple threads
Oracle databases use this extensively
See aio_read(), aio_write(), etc.
Enabling AIO on Linux
# rpm -Uvh aio.rpm
# vi init.ora
filesystemio_options="setall"
Check:
# ps -aef | grep dbwr
# strace -p <pid>
io_submit()… <- Check for io_submit in the syscall trace
Use Large Pages
Guest-OS level option to use large MMU pages
Maps the large SGA region with fewer TLB entries
Reduces MMU overheads
ESX 3.5 uniquely supports large pages!
Enabling Large Pages on Linux & SQLserver
# vi /etc/sysctl.conf
(add the following lines:)
vm/nr_hugepages=2048
vm/hugetlb_shm_group=55

# cat /proc/meminfo | grep Huge
HugePages_Total: 1024
HugePages_Free: 940
Hugepagesize: 2048 kB
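The nr_hugepages value is just the SGA size divided by the 2 MB (2048 kB) huge-page size; a sketch of the arithmetic, with an illustrative 4 GB SGA:

```shell
# Huge pages needed to back an SGA (4 GB here is an assumed example).
sga_mb=4096
page_kb=2048
echo "vm/nr_hugepages=$(( sga_mb * 1024 / page_kb ))"
```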
To set a trace flag in SQL Server 2005, follow these steps:
1. Open SQL Server Configuration Manager
2. Select SQL Server 2005 Services
3. Right-click the SQL Server service and select Properties
4. Select the Advanced tab
5. Edit the Startup Parameters property to set -T834
Note that you must separate the trace flag from other startup parameters with a semicolon (;) and no space, e.g. mastlog.ldf;-T834
Linux Versions
Some older Linux versions use a 1 kHz timer tick to optimize desktop-style applications
There is no reason to use such a high timer rate for server-class applications
The aggregate timer rate on a 4-vCPU Linux guest is over 70,000 per second!
Use RHEL5.1
Install 2.6.18-53.1.4 kernel or later
Put divider=10 on the end of the kernel line in grub.conf and reboot
All the RHEL clones (CentOS, Oracle EL, etc.) work the same way
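The grub.conf edit can be scripted; a sketch that appends divider=10 to each kernel line (the sample file name and contents below are made up for illustration):

```shell
# Create a throwaway sample grub.conf to edit (contents illustrative).
cat > grub.conf.sample <<'EOF'
title Red Hat Enterprise Linux Server (2.6.18-53.1.4.el5)
	kernel /vmlinuz-2.6.18-53.1.4.el5 ro root=/dev/VolGroup00/LogVol00
EOF
# Append divider=10 to every kernel line in place.
sed -i 's|^\([[:space:]]*kernel .*\)$|\1 divider=10|' grub.conf.sample
grep divider=10 grub.conf.sample
```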
Backup: Futures
vApp – New Model for Describing and Deploying Applications