© 2011 VMware Inc. All rights reserved
DBA Guide to Databases on VMware Solution Presentation - Don Sullivan – Senior Systems Engineer - Database Specialist
Don Sullivan – [email protected]
• Oracle Certified Master, Server Products Trainer for Oracle University and consultant with Oracle Advanced Technology Services - 1998-05
• Oracle SA for Polyserve/HP – 05-10
• VMware SE DB specialist 2010 – Present
Agenda
• Introduction
• Understanding VMware Performance
• Designing Databases on VMware
• Developing and Testing Databases
• Migrating Existing Databases
• Securing the Databases
• Running Databases on VMware
• Monitor and Troubleshoot Database Performance
Introduction
The Trend…
• Large, multi-core servers becoming commodity
  • Increasing number of CPU cores, memory, network bandwidth
  • Traditional “one app one server” model is outdated
• Increasing demands for high availability
  • Business going global
  • 24x7 internet
• Economy demands to increase IT efficiency
  • Reduce operational costs, increase productivity
  • Reduce HW and SW costs
• Increasing manageability challenges and security concerns over database server sprawl
Experienced DBAs Look to VMware…
App Costs: Reduce Infrastructure and Software License Costs
• Reduce infrastructure footprint through consolidation while maintaining full database isolation
• Increase utilization of software licenses
App Lifecycle: Accelerate Database Lifecycle from Dev to Production
• Reduce provisioning times from weeks to minutes
• Self-service provisioning
• Enable testing of databases with production clones
Quality of Service: Improve Quality of Service
• Built-in HA protects all database environments, from production and development to QA
• Simple and reliable disaster recovery, managed per site instead of per database
• Scale on demand to handle database spikes/peak utilization
Understanding VMware Performance
Application Performance Requirements
[Chart: % of Applications, by ESX release (1)]
• ESX 2: 1 vCPU, < 4 GB, 380 Mb/s, < 10,000 IOPS; 30% - 60% overhead
• ESX 3: 2 vCPU, 16 GB, 800 Mb/s, 20,000 IOPS; 20% - 30% overhead
• ESX 3.5: 4 vCPU, 64 GB, 9 Gb/s, 100,000 IOPS; <10% - 20% overhead
• ESX 4: 8 vCPU, 255 GB, 30 Gb/s, > 350,000 IOPS; <2% - 10% overhead
>95% of Apps Match Native Performance on Virtual Machines
1. Source: VMware Capacity Planner assessments
SQL Server Scale Up Performance Relative to Native
• At 1 and 2 vCPUs, ESX delivers 92% of native performance
• At 1, 2, and 4 vCPUs on the 8-pCPU server, ESX is able to effectively offload certain tasks to idle cores
• At 4 vCPUs, ESX delivers 88% of native performance; at 8 vCPUs, 86%
Single VM Performance: Well-Known Database OLTP Workload †
[Chart: Transaction Rate (Ratio to 1-way VM)]
Test configuration: Intel® Xeon® processor 5500 series based 8-pCPU server, RHEL 5.1, Oracle 11gR1, in-house ESX Server
• Near-perfect scalability from 1 to 8 vCPUs
• < 15% overhead for 8 vCPU VM
• 8,900 total DB transactions per second
• 60,000 I/O operations/second
† A fair-use implementation of the TPC-C workload; results are not TPC-C compliant
The average Oracle DB fits easily in a VM
• CPU: VM 8 vCPU; Oracle DB 2-4 CPU, 4% utilized
• Memory: VM 255 GB; Oracle DB 4-8 GB, 50% utilized
• Disk IO: VM 350,000 IOPS; Oracle DB 1,200 IOPS
• Network IO: VM 30 Gb/s; Oracle DB 2 MB/s
Source: VMware Capacity Planner analysis of > 700,000 servers in customer production environments
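The gap between what a VM can provide and what the average Oracle database uses can be worked out directly from the figures above. The Python sketch below divides each VM maximum by the corresponding Oracle figure. The dictionary layout and the `headroom` name are illustrative only; for CPU and memory the upper end of the quoted range is used, and 2 MB/s is converted to 16 Mb/s assuming 8 bits per byte and decimal units.

```python
# VM maxima and average Oracle DB demand, as quoted on the slide.
# Network figures are in Mb/s: 30 Gb/s = 30,000 Mb/s, 2 MB/s = 16 Mb/s (assumed conversion).
VM_MAX = {"vcpu": 8, "memory_gb": 255, "iops": 350_000, "network_mbps": 30_000}
ORACLE_AVG = {"vcpu": 4, "memory_gb": 8, "iops": 1_200, "network_mbps": 2 * 8}


def headroom(vm_max, demand):
    """Return how many times each VM maximum exceeds the demand figure."""
    return {name: vm_max[name] / demand[name] for name in vm_max}


ratios = headroom(VM_MAX, ORACLE_AVG)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x headroom")
```

Even with the upper end of each range, the smallest ratio is CPU, which is why the utilization figures matter more than the raw core count when sizing.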
Migrating Oracle 10g from UNIX to vSphere
• OnCourse Application: 125,000 total users, 12,000 concurrent users
• IBM pSeries: 9 Power5 cores, 100% utilized
• x86: 8 virtual CPUs, 50% utilized
“We have been able to virtualize our most demanding Oracle Databases on x86 servers. We now have the confidence that vSphere can handle our largest transaction-processing databases with ease.”
Rob Lowden, Director of IT at Indiana University
Designing Databases on VMware
vSphere High Availability Features
VMware HA
• Detects operating system and hardware failures
• Automatically restarts failed database virtual machines
• Provides a simple and reliable first line of defense for all databases
• Can be used in conjunction with Symantec App HA to provide application-aware protection
VMware vMotion
• Enables live migration of database virtual machines from one physical server to another without service interruption
• Can reduce virtual machine planned downtime
• Perform host maintenance any time of the day
VMware DRS
• Monitors state of virtual machine resource usage
• Can automatically and intelligently place virtual machines
• Directs compute resources where needed
• Maintains database response time and SLAs
Scalability on Demand
Dynamic Scaling on VMware
• Hot-Add Capacity: 1 vCPU, 2 GB to 4 vCPU, 64 GB
• VMotion to More Powerful Host
• Provision Additional App Instance in Minutes
Transforming Availability Service Levels
[Chart: Hardware Failure Tolerance (Unprotected, Automated Restart, Continuous) versus Application Coverage (0%, 10%, 100%), positioning VMware HA, VMware FT, VMotion (Planned Downtime), SQL Failover Clustering / Oracle RAC, and SQL Database Mirroring / Oracle Data Guard]
• Clustering is too complex and expensive for most applications
• VMware HA provides simple, cost-effective availability
• VMotion provides continuous availability against planned downtime
VMware vCenter Site Recovery Manager™ (SRM)
• Relies on storage replication
• Allows creation, maintenance, and execution of automated processes to facilitate site recovery
• Safe testing without impacting production environment
• Self-documenting
Conventional DB Consolidation is Difficult
Multi-Instancing (shared OS, one instance per database)
• No OS isolation (configuration, security, fault)
• No load balancing across physical nodes
Shared Instance (shared OS, one instance shared by all databases)
• No OS isolation (configuration, security, fault)
• No database isolation
• Resource isolation depends on DBMS Resource Governor
• No load balancing across physical nodes
Ideal Platform for DB Consolidation
[Diagram: legacy and current databases consolidated into VMs across physical nodes]
1. Fast consolidation with P2V: increase performance!
2. Preserve isolation in VM: OS isolation, DB isolation, security isolation
3. Guarantee resources: reservations, priorities, maximums
4. Load balance across nodes: vMotion, DRS
Developing and Testing Databases
Provisioning on Demand
Fast, Self-Service Provisioning
[Diagram: DBA, Lab Manager (and vCloud), Developer / QA]
Streamline Testing with Snapshots and Clones
[Diagram: Production and Test environments, steps numbered 1 to 4]
• Archive for fast roll-back
• Exact copy of production
• Run more tests faster
• Move changes into production
Benefits
• Faster testing
• More accurate testing on exact production copy
• Lower cost testing infrastructure
[Diagram: three vApps, each containing Web, APP, and DB virtual machines with their own OS]
Migrating Existing Databases
P2V with vCenter Converter
• Easy-to-use, wizard-driven process
• Converts multiple local and remote physical database servers simultaneously with centralized management console
• Creates one-to-one mapping from physical server to database virtual machine
• Stop database services (leave the OS running) before hot cloning the database server to ensure data consistency
New Database Installation
• Install new OS and database software on VM, then migrate data from physical server
• Works well when planning database upgrade with migration
• Works with VMware Templates and Clones for rapid deployment of multiple databases
• With RDMs, data can swing over without backup/copy/restore
  • Minimize downtime
  • No additional storage requirement for migration
• When used with native database replication features (such as mirroring, log shipping), the database VM can run side-by-side with the physical server to minimize migration downtime
Securing the Databases
Better-than-Physical Security
• More granular security compared to native database consolidation
• Minimize the database “surface area” per VM
• Allows customization of security at VM level
  • Install/enable components and features as needed
  • Enable network protocols as needed
  • More selective administrative and db owner privileges
• Database patching and change management less risky
Running Databases on VMware
Reduce Planned and Unplanned Downtime
Protect Databases against Hardware Failures
• Built-in, host based high availability
• Simple to configure and easy to manage
• Protects against hardware or operating system failures
• Provides a first line of defense for all databases on the host, including production, development, and QA
VMware HA with Database Mirroring for Faster Recovery
• Works in conjunction with native database high availability features
• Protection against HW/SW failures and DB corruption
In-guest Backup
• Standard method for physical or virtual
• Agent runs in the VM guest and handles database quiescing
• Data is sent over the IP network
• Can affect CPU utilization in the guest OS
Array-based Backup
• Backup vendor software coordinates with VSS to create a supported backup image of the databases
• Snapshotted databases can later be streamed to tape as flat files with no IO impact on the production databases
Manage Patch Upgrade
• The Challenges
  • Patch/upgrade may introduce bugs, regressions
  • Uninstalling a patch/upgrade may not be possible
  • Rolling back a patch/upgrade requires rebuilding the environment and restoring data from backup
• VMware Solutions
  • Testing with production clones reduces the risk of regression
  • VMware Snapshot
    • Creates a snapshot of the state and data of the database virtual machine at a specific point in time
    • Allows DBAs to easily revert to the original state of the database virtual machine before the upgrade
Manage Legacy Databases
• The Challenges
  • Organizations need to maintain legacy databases due to regulatory/compliance requirements, and other reasons
  • Legacy databases are not upgradable due to vendor support and HW compatibility issues
  • Older hardware tends to fail more frequently
• VMware Solutions
  • VMs can be cloned and stored in a virtual vault/archive, then powered on in the event of an audit or discovery request
  • Virtualization abstracts the OS/app from the underlying hardware, which enables legacy databases to run on the latest hardware
  • Legacy database performance can be improved significantly by moving to the latest hardware
Monitor and Troubleshoot Database Performance
Host Level Monitoring
• vSphere Client
  • GUI interface, primary tool for observing performance and configuration data for one or more ESX/ESXi hosts
  • Does not require high levels of privilege to access the data
• resxtop/esxtop
  • Gives access to detailed performance data of a single ESX/ESXi host
  • Provides fast access to a large number of performance metrics
  • Requires root-level access
  • Runs in interactive, batch, or replay mode
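Batch-mode output lends itself to offline analysis. The Python sketch below assumes the capture is a CSV file with one header row of counter names followed by one row per sample, which is how esxtop batch mode typically writes its data. The function name, the sample data, and the shortened column names are hypothetical; real counter names are longer and host-specific.

```python
import csv
import io


def average_matching_columns(csv_text, needle):
    """Return {column name: mean value} for columns whose header contains needle."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if needle in name]
    values = {i: [] for i in wanted}
    for row in reader:
        for i in wanted:
            try:
                values[i].append(float(row[i]))
            except (IndexError, ValueError):
                continue  # skip short rows and non-numeric cells
    return {header[i]: sum(v) / len(v) for i, v in values.items() if v}


# Hypothetical two-sample capture with simplified column names.
SAMPLE = (
    "Time,host1 CPU % Util Time,host1 Free MBytes\n"
    "t1,40.0,1024\n"
    "t2,60.0,1000\n"
)
averages = average_matching_columns(SAMPLE, "% Util Time")
print(averages)
```

Matching on a substring rather than an exact name keeps the sketch usable when the host name or device identifier embedded in each counter name is not known in advance.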
Key Metrics to Monitor
Resource  Metric              Host / VM  Description
CPU       %USED               Both       CPU used over the collection interval (%)
CPU       %RDY                VM         CPU time spent in ready state
CPU       %SYS                Both       Percentage of time spent in the ESX Server VMkernel
Memory    Swapin, Swapout     Both       Memory the ESX host swaps in/out from/to disk (per VM, or cumulative over host)
Memory    MCTLSZ (MB)         Both       Amount of memory reclaimed from resource pool by way of ballooning
Disk      READs/s, WRITEs/s   Both       Reads and writes issued in the collection interval
Disk      DAVG/cmd            Both       Average latency (ms) of the device (LUN)
Disk      KAVG/cmd            Both       Average latency (ms) in the VMkernel, also known as “queuing time”
Disk      GAVG/cmd            Both       Average latency (ms) in the guest. GAVG = DAVG + KAVG
Network   MbRX/s, MbTX/s      Both       Amount of data received/transmitted per second
Network   PKTRX/s, PKTTX/s    Both       Packets received/transmitted per second
Network   %DRPRX, %DRPTX      Both       Percentage of receive/transmit packets dropped
Database VM Level Monitoring
• The primary tools and methodologies for monitoring database performance have not changed
• Monitoring tools
  • SQL Server: Perfmon, Profiler, Dynamic Management Views
  • Oracle: Statspack/AWR
  • Time-based metrics reported by in-guest tools may not be accurate; use host-level monitoring tools for those
• Focus on identifying bottlenecks instead of time-based measurements
  • CPU bottleneck: high processor queue length
  • IO bottleneck: high disk queue length
Host CPU Saturation
• Typical symptoms
  • DB instance: sluggish performance with no apparent CPU, memory, or disk resource issue
  • ESX: sustained high host CPU utilization (avg. > 75%, peak > 90%) and high VM ready time
• Common causes
  • CPU overcommitment
  • Unexpected guest VM CPU saturation driving up the host CPU usage
• Solutions
  • Use vMotion or DRS to redistribute VMs to other hosts
  • Use resource controls to ensure resources are available to DB instance VMs
  • Check that hardware-assisted virtualization is enabled
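The host utilization thresholds above can be turned into a simple check. The Python sketch below is a minimal illustration, assuming a list of host CPU utilization samples (in percent) has already been collected. The function and constant names are hypothetical, and treating the two thresholds as a joint condition is an assumption; the slide lists them together without saying how they combine. VM ready time is left out because the slide gives no numeric threshold for it.

```python
from statistics import mean

AVG_LIMIT = 75.0   # average host CPU utilization threshold from the slide (%)
PEAK_LIMIT = 90.0  # peak host CPU utilization threshold from the slide (%)


def host_cpu_saturated(samples):
    """Return True when both the average and the peak exceed the slide thresholds."""
    if not samples:
        raise ValueError("at least one utilization sample is required")
    # Assumption: both conditions must hold for the host to count as saturated.
    return mean(samples) > AVG_LIMIT and max(samples) > PEAK_LIMIT


print(host_cpu_saturated([80.0, 85.0, 95.0]))
print(host_cpu_saturated([50.0, 60.0, 95.0]))
```

A host that fails this check while the database still feels sluggish points away from CPU and toward the memory and disk cases covered next.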
Guest Memory Misconfiguration
• Typical symptoms
  • Oracle / SQL Server: low buffer cache hit ratio, low page life expectancy, high number of lazy writes, high number of checkpoint pages/sec
  • ESX: ballooning > 0
• Common causes
  • Misconfiguration of instance memory and/or insufficient ESX memory reservation for VM
• Solutions
  • Set VM memory reservation = memory provisioned
  • Set policies which disallow overcommitting of CPU resources
  • Analyze vCPU utilization and verify that vCPUs are not idle
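The two memory conditions named above (ballooning greater than zero, and a reservation smaller than provisioned memory) are easy to test once the values are in hand. The Python sketch below is illustrative only: the `VmMemory` class, its field names, and the finding strings are hypothetical, and the values are assumed to have been read from host-level tools beforehand (for example, the MCTLSZ counter for the ballooned amount).

```python
from dataclasses import dataclass


@dataclass
class VmMemory:
    provisioned_mb: int   # memory configured for the VM
    reservation_mb: int   # ESX memory reservation for the VM
    ballooned_mb: float   # memory reclaimed by ballooning (e.g. MCTLSZ)


def memory_findings(vm):
    """List the memory misconfiguration symptoms from the slide that apply to vm."""
    findings = []
    if vm.ballooned_mb > 0:
        findings.append("ballooning active")
    if vm.reservation_mb < vm.provisioned_mb:
        findings.append("reservation below provisioned memory")
    return findings


print(memory_findings(VmMemory(provisioned_mb=16384, reservation_mb=8192, ballooned_mb=512.0)))
print(memory_findings(VmMemory(provisioned_mb=16384, reservation_mb=16384, ballooned_mb=0.0)))
```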
Monitoring Disk Performance with ESXTOP
• Rule of thumb
  • GAVG/cmd > 20ms = high latency!
• What does this mean?
  • Latency when command reaches device is high.
  • Latency as seen by the guest is high.
  • Low KAVG/cmd means command is not queuing in VMkernel.
[esxtop screenshot: very large values for DAVG/cmd and GAVG/cmd]
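The reasoning above combines the relation GAVG = DAVG + KAVG with the 20 ms rule of thumb. The Python sketch below works through it for a single sample. The function name and the returned labels are hypothetical, and attributing the latency to whichever component is larger is a simplifying assumption, since the slide gives no numeric threshold for a "low" KAVG/cmd.

```python
HIGH_GAVG_MS = 20.0  # rule-of-thumb threshold from the slide


def diagnose_disk_latency(davg_ms, kavg_ms):
    """Return (GAVG in ms, where the latency comes from) for one esxtop sample."""
    gavg_ms = davg_ms + kavg_ms  # GAVG = DAVG + KAVG
    if gavg_ms <= HIGH_GAVG_MS:
        return gavg_ms, "ok"
    # Assumption: blame the larger component; the slide defines no KAVG threshold.
    if davg_ms >= kavg_ms:
        return gavg_ms, "device"
    return gavg_ms, "vmkernel queue"


print(diagnose_disk_latency(45.0, 0.5))
print(diagnose_disk_latency(5.0, 30.0))
print(diagnose_disk_latency(4.0, 1.0))
```

The first call mirrors the case on the slide: high DAVG/cmd with low KAVG/cmd means the commands are not queuing in the VMkernel, so the storage device or array is the place to look.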
Insufficient Disk Sub-System
• Typical symptoms
  • Oracle / SQL Server: high number of waits for PAGEIOLATCH_EX, PAGEIOLATCH_SH
  • ESX: disk latency is high, GAVG/cmd > 20ms
• Common causes
  • Overloaded or misconfigured storage sub-system
  • Sub-optimal query execution plan
• Solutions
  • Make sure devices are configured properly (caches, queue depths)
  • Check networking settings (for iSCSI/NAS)
  • Increase memory to reduce need for disk access
  • Tune indexes and queries to reduce the number of IOs
  • Use Storage vMotion to balance load across storage systems
Resources
• Visit our partner central for Solutions Toolsets http://www.vmware.com/partners/partners.html
• Running business critical applications on VMware http://www.vmware.com/solutions/business-critical-apps/
  • Best Practices, Reference Architectures, and Case Studies
  • Microsoft Apps (Exchange, SQL, SharePoint)
  • Oracle
  • SAP
• Performance White Paper http://www.vmware.com/resources/techresources/
• Performance User Community http://communities.vmware.com/community/vmtn/general/performance