VMware Virtualization of Oracle and Java
Scott B. Drummonds
Tim Harris
Agenda
VMware Virtualization Overview
Architecture, performance, and overheads
Best Practices
Oracle, Java
Performance Data
Scaling, large memory pages, AMD RVI
Conclusions
VMware Infrastructure 3 Architecture
CPU resources are controlled by the scheduler and virtualized by the monitor
Memory is allocated by the VMkernel and virtualized by the monitor
Network and I/O devices are emulated and proxied through native device drivers
[Figure: VMware ESX Virtualization Architecture. Labels: Guest (TCP/IP, File System), Monitor, VMkernel (Scheduler, Memory Allocator, Virtual NIC, Virtual SCSI, Virtual Switch, File System, NIC Drivers, I/O Drivers), Physical Hardware]
Speeding Up Virtualization
Privileged instruction virtualization
Traps from de-privileging or ring compression to handle privileged instructions
Memory virtualization
Memory partitioning and allocation of physical memory
Device and I/O virtualization
Routing I/O requests between virtual devices and physical HW
Where are the various virtualization performance hits?
Multi-Mode Monitors
There are different types of monitors for different workloads and CPU types
VMware ESX provides a dynamic framework that allows the best monitor to be chosen for the workload
[Figure: three guests, one per monitor type (Binary Translation, Para-Virtualization, Hardware Assist). Other labels: VMkernel (Scheduler, Memory Allocator, Virtual NIC, Virtual SCSI, Virtual Switch, File System, NIC Drivers, I/O Drivers), Physical Hardware]
MMU Virtualization without hardware support
Guest maintains sets of page tables
In native execution, they would have been used for address translation
Virtual Machine Monitor (VMM) maintains a set of shadow page tables
There is one shadow page table for each guest page table
VMM sets CR3 to point to shadow page tables
Translation happens using shadow page tables
Guest page tables contain VPN->PPN translations, while shadows contain VPN->MPN translations
[Figure: CR3 points to the shadow page tables (VPN->MPN translations); the guest page tables hold VPN->PPN translations]
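The relationship between guest and shadow page tables can be sketched in a few lines of Python. This is a toy model under stated assumptions, not VMware's implementation: page tables are flat dicts, the page numbers are made up, and `vmm_pmap`, `build_shadow`, and `translate` are hypothetical names.

```python
# Toy model: each page table is a flat dict keyed by page number.
# All page numbers below are invented for illustration.
guest_page_table = {0x10: 0x2, 0x11: 0x3}   # VPN -> PPN, maintained by the guest
vmm_pmap = {0x2: 0x7A, 0x3: 0x15}           # PPN -> MPN, maintained by the VMM


def build_shadow(guest_pt, pmap):
    """Compose VPN->PPN with PPN->MPN to obtain the VPN->MPN shadow table."""
    return {vpn: pmap[ppn] for vpn, ppn in guest_pt.items() if ppn in pmap}


def translate(vpn, shadow):
    """Single lookup, as the hardware would do once CR3 points at the shadow."""
    return shadow[vpn]


shadow_page_table = build_shadow(guest_page_table, vmm_pmap)

# A guest page table update is invisible to the hardware until the VMM
# refreshes the shadow, which is where the software overhead comes from.
guest_page_table[0x12] = 0x2
shadow_page_table = build_shadow(guest_page_table, vmm_pmap)
```

The point of the sketch is that translation itself is a single lookup, while every guest page table change costs the VMM a shadow update.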
Rapid Virtualization Indexing
RVI provides mechanism to avoid shadow page tables
Provides a second layer of page tables
Contain physical to machine address translations
Hypervisor maintains them
So, with RVI
Guest page tables contain virtual to physical translation
Second level tables contain physical to machine translation
Using two tables, hardware converts virtual address to machine address
[Figure: guestCR3 points to the guest page tables (VPN -> PPN mapping); nestedCR3 points to the hypervisor page tables (PPN -> MPN mapping)]
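A rough Python sketch of the two-table walk follows. It assumes flat dict tables and invented page numbers; real hardware walks multi-level tables, and `hardware_translate` is a hypothetical name for what the MMU does.

```python
# Toy model of nested paging: two independent tables, walked at access time.
guest_page_table = {0x10: 0x2}     # VPN -> PPN, owned by the guest (guestCR3)
nested_page_table = {0x2: 0x7A}    # PPN -> MPN, owned by the hypervisor (nestedCR3)


def hardware_translate(vpn, guest_pt, nested_pt):
    """Two lookups per translation: virtual -> physical -> machine."""
    ppn = guest_pt[vpn]
    return nested_pt[ppn]


# The guest can remap a virtual page on its own. No shadow table needs to be
# updated, because the hypervisor's table is unaffected by the change.
guest_page_table[0x11] = 0x2
```

Compared with shadow page tables, the cost moves from table maintenance to the walk itself, which is why fewer and larger pages help.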
What is Page Sharing?
Content-based
Hint (hash of page content) generated for 4K pages
Hint is used for a match
If matched, perform bit by bit comparison
COW (Copy-on-Write)
Shared pages are marked read-only
Write to the page breaks sharing
[Figure: two panels, each showing VM 1, VM 2, and VM 3 above the hypervisor]
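The hint, full comparison, and copy-on-write steps can be illustrated with a small Python model. This is a sketch under assumptions: SHA-1 stands in for whatever hash the hypervisor actually uses, and the `SharedMemory` class and its methods are hypothetical.

```python
import hashlib

PAGE_SIZE = 4096  # sharing operates on 4K pages


class SharedMemory:
    """Toy content-based page sharing with copy-on-write."""

    def __init__(self):
        self.machine_pages = {}  # mpn -> page content (bytes)
        self.refcount = {}       # mpn -> number of guest pages mapped to it
        self.hints = {}          # content hash -> candidate mpn
        self.mapping = {}        # (vm, ppn) -> mpn
        self.next_mpn = 0

    def _alloc(self, content):
        mpn = self.next_mpn
        self.next_mpn += 1
        self.machine_pages[mpn] = content
        self.refcount[mpn] = 0
        return mpn

    def _map(self, key, mpn):
        self.mapping[key] = mpn
        self.refcount[mpn] += 1

    def add_page(self, vm, ppn, content):
        hint = hashlib.sha1(content).digest()
        candidate = self.hints.get(hint)
        # A hint match is only a candidate; confirm with a full comparison.
        if candidate is not None and self.machine_pages[candidate] == content:
            mpn = candidate
        else:
            mpn = self._alloc(content)
            self.hints[hint] = mpn
        self._map((vm, ppn), mpn)

    def write(self, vm, ppn, offset, data):
        key = (vm, ppn)
        mpn = self.mapping[key]
        if self.refcount[mpn] > 1:
            # Shared pages are read-only: a write breaks sharing with a copy.
            self.refcount[mpn] -= 1
            mpn = self._alloc(self.machine_pages[mpn])
            self._map(key, mpn)
        page = bytearray(self.machine_pages[mpn])
        page[offset:offset + len(data)] = data
        self.machine_pages[mpn] = bytes(page)
```

For example, three VMs that each add an identical zero-filled page end up backed by one machine page until one of them writes to it.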
ESX Server Memory Ballooning
[Figure: balloon inside the guest. Expand: may page content out to virtual disk. Shrink: may bring content from virtual disk. Other labels: Borrow Pages, Lend Pages]
Guest OS has better information than VMkernel
Which pages are stale
Which pages are unused
Guest Driver installed with VMware Tools
Artificially induces memory pressure
VMkernel decides how much memory to reclaim, but guest OS gets to choose particular pages
[Figure: VM with VMware Tools installed]
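The division of labor in ballooning (the VMkernel picks the amount, the guest picks the pages) can be sketched as follows. The page states and the `choose_balloon_pages` function are hypothetical simplifications; a real guest OS applies its own reclaim policy when the balloon driver allocates memory.

```python
def choose_balloon_pages(guest_pages, target):
    """Pick `target` pages to hand back, preferring the cheapest to give up.

    guest_pages maps a page id to one of "unused", "stale", or "active".
    Only the guest knows these states, which is why it makes the choice.
    """
    preference = {"unused": 0, "stale": 1, "active": 2}
    ranked = sorted(guest_pages, key=lambda p: (preference[guest_pages[p]], p))
    return ranked[:target]


# The VMkernel supplies only the number of pages to reclaim (here, 3).
pages = {1: "active", 2: "unused", 3: "stale", 4: "unused"}
reclaimed = choose_balloon_pages(pages, 3)
```

In this toy run the two unused pages and the stale page are reclaimed, and the active page stays resident.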
Oracle Databases on VI3
Oracle Database Characteristics
What it's not:
It's *not* a huge I/O consumer
Most common Oracle databases have modest I/O profiles
It does *not* have a small memory footprint
Large-ish memory footprint and modest I/O most common
Tuning a DB for virtualization is *not* unique rocket science
Many standard tuning activities benefit virtualized DBs substantially
Capacity Planner Data for Oracle Databases
Out of 13K Physical Oracle DBs considered
65% of systems on 2 core systems, averaging 5% CPU utilization
Roughly 4% of systems fully consume more than 2 cores
Most consume between 2 and 4 GB of RAM
Static RAM consumption points to fixed SGA with fixed # of PGAs
Oracle DB Workload Characteristics
Memory
Large In-Memory Footprint (SGA)
Ensure good cache hit ratio
Target 98% or higher
Small number of processes to protect against TLB misses
I/O
Lower than generally assumed (50 I/O operations per second on average)
Depends on quality of SQL execution plans
Overall Privileged Instructions
I/O, Context Switch (TLB misses)
Making your DB Ready to Virtualize
Well Tuned DB for Physical is Good for Virtual
Poor Execution Plans even worse in Virtual
Cause poor cache re-use
Cause additional I/O and hence CPU Overhead
Cause additional impact on storage
Minimize Full Table Scans
Up to date statistics for CBO
Small number of “db file scattered read” wait events
Tune SQL with high number of physical reads per execute
Virtualization Overhead for Oracle DBs
Well Tuned DB
Typically 10 to 20% additional CPU required over physical
Poorly Tuned DB
Maybe 20 to 30% or even more
Depends on SQL execution plans
User Impact of Additional CPU Requirements
Allocate additional CPU per VM to cover overhead
Results in minimal impact to user response time
If the VM's CPU is pegged, expect substantial impact on users
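The overhead figures above translate into a simple sizing calculation. The sketch below assumes the overhead scales linearly with CPU consumed on physical hardware; `vcpus_needed` is a hypothetical helper, and the inputs are illustrative values rather than measurements.

```python
import math


def vcpus_needed(physical_cores_used, overhead_fraction):
    """Cores consumed on physical hardware, inflated by virtualization overhead.

    overhead_fraction is e.g. 0.10 to 0.20 for a well tuned DB and
    0.20 to 0.30 (or more) for a poorly tuned one.
    """
    required = physical_cores_used * (1 + overhead_fraction)
    return max(1, math.ceil(required))


# A well tuned DB using 1.5 cores on physical hardware, at 20% overhead.
well_tuned = vcpus_needed(1.5, 0.20)
# A poorly tuned DB using 3 cores on physical hardware, at 30% overhead.
poorly_tuned = vcpus_needed(3.0, 0.30)
```

Rounding up to whole vCPUs is what provides the headroom that keeps the VM from running pegged.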
Oracle Physical to Virtual Conversion Process
All but the Hungriest DBs will fit on a VM
Use smallest VM that will suffice
I.e., a 1 vCPU VM is more efficient than a 2 vCPU VM if the workload fits
Current limit of 4 vCPU per VM
Will not exist for long
Limits virtualization of DBs that consume more than 3 physical cores
Appears to be a small fraction of all DBs: less than 5% in our surveys
General Best Practices for Virtualizing DBs
Characterize DBs into three rough groups
Green DBs – typically 70%
Ideal candidate for virtualization: Well tuned and modest CPU consumption
Yellow DBs – typically 25%
Likely candidate for virtualization
May need some SQL tuning and monitoring to understand CPU and I/O requirements
Red DBs – typically 5%
Unlikely candidates until larger VMs available
Consumes 4 or more physical cores
Not a lot of SQL tuning to be done
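The three groups can be expressed as a small triage function. Only the red rule (4 or more physical cores) comes from the slide; the `modest_cores` cutoff for "modest CPU consumption" is an assumed value, and `classify_db` is a hypothetical name.

```python
def classify_db(cores_used, well_tuned, modest_cores=2.0):
    """Sort a database into the green / yellow / red groups.

    cores_used:   physical cores the DB consumes today
    well_tuned:   whether SQL tuning is already in good shape
    modest_cores: assumed threshold for "modest" CPU use (not from the slides)
    """
    if cores_used >= 4:
        return "red"      # unlikely candidate until larger VMs are available
    if well_tuned and cores_used <= modest_cores:
        return "green"    # ideal candidate
    return "yellow"       # likely candidate, needs tuning and monitoring first
```

For instance, a well tuned DB consuming a tenth of a core is green, while an untuned DB on one core is yellow until its CPU and I/O needs are understood.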
OLTP vs DSS Oracle Workloads and Virtualization
OLTP Workloads
Assume frequent small queries
Should hit efficient index almost all the time
Basic diagnostics with AWR report
Need small physical reads per exec, no full table scans
DSS Workloads
Should hit summary tables vs base tables as much as possible
Use materialized views to roll up as batch jobs at night
Daytime load should be index lookups
May summarize delta from summary in real time when necessary
Direct I/O
Guest-OS Level Option for Bypassing the guest cache
Uncached access avoids multiple copies of data in memory
Avoids read/modify/write modulo the file system block size
Bypasses many file-system level locks
Enabling Direct I/O on Linux
# vi init.ora
filesystemio_options="setall"
Check:
# iostat 3
(Check for I/O size matching the DB block size…)
Asynchronous I/O
An API for single-threaded process to launch multiple outstanding I/Os
Multi-threaded programs could just use multiple threads
Oracle databases use this extensively
See aio_read(), aio_write(), etc.
Enabling AIO on Linux
# rpm -Uvh aio.rpm
# vi init.ora
filesystemio_options="setall"
Check:
# ps -aef | grep dbw
# strace -p <pid>
io_submit()… <- Check for io_submit in syscall trace
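Both the direct I/O and asynchronous I/O slides rely on the same filesystemio_options parameter. The Python sketch below reads a parameter file's text and reports which modes the setting implies. The value names (none, asynch, directio, setall) are Oracle's; the `io_modes` helper is hypothetical, and treating a missing parameter as "none" is an assumption, since the default is platform dependent.

```python
def io_modes(pfile_text):
    """Return (direct_io, async_io) implied by filesystemio_options."""
    value = "none"  # assumption: the real default varies by platform
    for line in pfile_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if "=" not in line:
            continue
        name, _, raw = line.partition("=")
        # Accept an optional "*." or "<sid>." prefix in front of the name.
        if name.strip().lower().split(".")[-1] == "filesystemio_options":
            value = raw.strip().strip("'\"").lower()
    direct_io = value in ("directio", "setall")
    async_io = value in ("asynch", "setall")
    return direct_io, async_io


sample = 'db_name=orcl\nfilesystemio_options="setall"\n'
direct_enabled, async_enabled = io_modes(sample)
```

With "setall", both values come back true, which matches the slides using the same setting for both features.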
Use Large Pages
Guest-OS level option to use large MMU pages
Maps the large SGA region with fewer TLB entries
Reduces MMU overheads
Enabling Large Pages on Linux
# vi /etc/sysctl.conf (add the following lines:)
vm/nr_hugepages=2048
vm/hugetlb_shm_group=55
# cat /proc/meminfo | grep Huge
HugePages_Total: 1024
HugePages_Free: 940
Hugepagesize: 2048 kB
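The number of huge pages to reserve follows from the SGA size and the huge page size. The Python sketch below shows the arithmetic and parses the huge page lines of /proc/meminfo output. The helper names are hypothetical, and the 3 GB SGA is only an example input.

```python
def hugepages_for_sga(sga_bytes, hugepage_kb=2048):
    """Smallest number of huge pages that covers the SGA (integer ceiling)."""
    page_bytes = hugepage_kb * 1024
    return -(-sga_bytes // page_bytes)


def parse_hugepage_info(meminfo_text):
    """Extract the Huge* counters from /proc/meminfo-style text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip().lower().startswith("huge"):
            info[key.strip()] = int(rest.split()[0])
    return info


# Example: a 3 GB SGA with 2048 kB huge pages.
pages_needed = hugepages_for_sga(3 * 1024**3)

sample = "HugePages_Total: 1024\nHugePages_Free: 940\nHugepagesize: 2048 kB"
info = parse_hugepage_info(sample)
```

By the same arithmetic, a setting of 2048 pages at 2048 kB each reserves 4 GB, so the reservation should be checked against the SGA actually configured.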
Linux Versions
Some older Linux versions use a 1 kHz timer to optimize desktop-style applications
There is no reason to use such a high timer rate on server-class applications
The timer rate on 4 vCPU Linux guests is over 70,000 per second!
Use RHEL5.1
Install 2.6.18-53.1.4 kernel or later
Put divider=10 on the end of the kernel line in grub.conf and reboot.
All the RHEL clones (CentOS, Oracle EL, etc.) work the same way.
Page Sharing and Large Memory Pages
Large Pages In Oracle
Can increase efficiency of memory management
Large Pages are Not Shared
Expect less reduction in memory consumption with large pages
Hardware Assisted Memory Management
Benefits from use of Large Pages
No other hypervisor uses large pages today
Expect AMD RVI and Intel EPT to work well with VMware Infrastructure
And likely not with hypervisors that don’t support large pages
Page Sharing With Oracle DBs
Page Sharing in VMware Infrastructure
Reduce memory consumption by sharing common pages
Common pages include
OS related pages
Executable related pages
I.e., Oracle executables for each VM running Oracle
Serves to allow larger SGA with overall memory consumption reduction
Oracle Performance Study: SwingBench TPC-like Transaction Processing Benchmark
Order-entry benchmark: order & product processing
Java client generator with Oracle back-end
SwingBench Configuration
Database:
• # of Users = 5,011,872
• # of Products = 5,011,872
• DB size = 5.11GB

Test:
• Test duration = 10 mins
• # of Users per run = 30
• Think time = 0
• New customer - 11%
• Browse products - 28%
• Order products - 28%
• Process orders - 5%
• Browse orders - 28%

• RHEL5 U1
• 64 bit
• 2.6.18-53.1.13.el5

• ESX version: 3.5 build# 60217
• Number of VMs: 1
• vCPU: 4
• Mem: 6GB
• vDisk: 16GB
• vNIC: 1

• Oracle 11g (11.1.0)
• RHEL4 x86_64
• SGA: 3GB
• PGA: 1GB

• Dell PowerEdge 2950
• Mem: 8GB
• Two dual-core Intel(R) Xeon(R) CPU 5160 @ 3.00GHz processors, 4MB cache
• Storage: CX 3-40 (30 disks)
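The transaction mix in the test configuration can be reproduced with weighted random selection. The Python sketch below is only an illustration of how such a mix could be driven, not SwingBench's own load generator; `generate_transactions` and its seed parameter are hypothetical.

```python
import random
from collections import Counter

# Percentages from the test configuration; they sum to 100.
TRANSACTION_MIX = {
    "New customer": 11,
    "Browse products": 28,
    "Order products": 28,
    "Process orders": 5,
    "Browse orders": 28,
}


def generate_transactions(n, seed=0):
    """Draw n transaction types according to the configured mix."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    names = list(TRANSACTION_MIX)
    weights = list(TRANSACTION_MIX.values())
    return rng.choices(names, weights=weights, k=n)


counts = Counter(generate_transactions(10_000))
```

With zero think time, each of the 30 users would issue the next drawn transaction as soon as the previous one completes.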
Measuring the Performance of DB Virtualization
SwingBench Oracle Single DB Scaling
Number of virtual CPUs in Database/Guest
Study: The Oracle DVD-Store Benchmark
Simulates a large multi-tier application with Oracle as the back-end database
Simulates DVD store transactions
Java client tier
Oracle Database
Sun 16-core x4600 M2
VMware ESX 3.5
Oracle 10g R2
RHEL4, Update 4, 64-bit
EMC CLARiiON CX3-40
30 x 15k spindles
Many Large Databases: Scaling Out
What happens when we consolidate more than one large database per host?
Increase number of large databases and measure performance
Key criteria: Throughput and Response Time
Scale DVD-Store Benchmark
From 1 to 7 Databases, each with their own VM
From 2 to 16 Physical CPU cores
From 32 to 256 GB of RAM
“Large” Database Consolidation Study
Oracle Performance (Response time)
Java on VI3
Java Workload Characteristics
CPU
Intensive; threads; not processes
Memory
Heavy
Network
Tends to be light
Storage
Tends to be light
Page Sharing Java
Common pages to OS
Common pages to JVM
Common application pages—only where apps are identical?
Garbage collection
Fewer zero pages
Tends to fill up assigned memory
Configurable through JVM?
Page Sharing and Large Memory Pages
Beware of the combination of large pages and memory over-commitment
large pages are not shared
when sharing is needed, large pages are backed by normal pages (4K)
This is a repeat of earlier slide in Oracle section…reconcile for final
VM Memory Over-commitment with Java
JVM is a VM within the OS
If balloon driver takes memory from JVM, access to JVM heap will force guest swapping
this is particularly bad with JVM heap access which tends to be random—no locality
Balloon and Swap Interaction
oom_killer
Java Config
Multiple JVMs known to outperform single large JVM
Requires app with a scale-out model
Scaling out VMs a better idea
DRS
Java Tuning
Understand
Objects created and put in Eden
After certain life, pushed to long-lived area
GC sweeps Eden aggressively and less so with long-lived area
So…
Eden sizing impacts memory access
GC thread count increases raise memory access profile and virtual overhead
This is another reason for using our model of multiple VMs each with their own JVM
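The Eden and long-lived area behavior described above can be modeled with a toy heap in Python. This is a deliberate simplification: real JVM collectors also use survivor spaces and trace reachability themselves, and the `ToyGenerationalHeap` class, its `tenure_threshold`, and the caller-supplied `live` set are all hypothetical.

```python
class ToyGenerationalHeap:
    """Objects start in Eden and are promoted after surviving enough GCs."""

    def __init__(self, tenure_threshold=2):
        self.eden = {}     # object id -> young collections survived so far
        self.old = set()   # long-lived area
        self.tenure_threshold = tenure_threshold

    def allocate(self, obj_id):
        self.eden[obj_id] = 0

    def young_gc(self, live):
        """Frequent, aggressive sweep of Eden only."""
        survivors = {}
        for obj_id, age in self.eden.items():
            if obj_id not in live:
                continue                   # unreachable: reclaimed
            age += 1
            if age >= self.tenure_threshold:
                self.old.add(obj_id)       # pushed to the long-lived area
            else:
                survivors[obj_id] = age
        self.eden = survivors

    def full_gc(self, live):
        """Less frequent sweep that also covers the long-lived area."""
        self.young_gc(live)
        self.old &= set(live)
```

A smaller Eden fills sooner and triggers more young collections, which is one way Eden sizing changes the memory access profile.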
JRockit
BEA
OS-less
Optimal out-of-box
Common App Tuning
Linux kernel 2.6.22.16 (check)
RHEL
SUSE – 250 Hz
Others
Performance Data
Java Scalability
Will Java scale to 16 cores?
If no, show graph.
SwingBench Oracle Single DB Scaling
Number of virtual CPUs in Database/Guest
“Large” Database Consolidation Study
Scaling to 16 Cores, 256GB RAM!
Oracle Performance (Response time)
Storage Protocols: Sequential Read Throughput
Storage Protocols: Sequential Write Throughput
VMFS Performance: VMFS versus RDM