Xen Summit AMD 2010
VM Memory Allocation Schemes and PV NUMA Guests
Dulloor Rao
Agenda
● Motivation
● VM memory allocation strategies – CONFINED, SPLIT, STRIPED
● AUTOMATIC (default) allocation scheme
● PV NUMA Guests
● Summary
Motivation – NUMA Overheads
● CPU0 and CPU1 are Hyper-Threads.
● CPU0 and CPU2 are on the same node.
● CPU0 and CPU8 are on different nodes.
● Overheads are due to both Cache Hierarchy (L1/L2/LLC) and Memory Organization (NUMA)
● Modified Cache Coherency State – Cacheline is present only in the current cache and is dirty. The cacheline must be written back to main memory before any other read of that memory location is allowed.
● Substantial overhead in accessing remote node's memory.
Motivation – NUMA-related OS Optimizations (Linux as example)
● OS employs many optimizations to reduce inter-node memory accesses – memory management, scheduler, OS data-structures, etc.
● OS defines multiple NUMA allocation policies (MPOL_{DEFAULT/BIND/PREFERRED/INTERLEAVE}) to suit different applications. DEFAULT is local allocation.
● Significant performance improvement from system-level NUMA optimizations.
Motivation – NUMA-related Application Optimizations (Linux)
● DEFAULT memory policy (of allocating from local node) and a NUMA-aware scheduler reduce the inter-node accesses.
● Libraries and tools (libnuma and numactl on Linux) are provided to select an appropriate memory placement policy for specific application requirements.
● CONCLUSION – NUMA-related optimizations at OS-level and Application-level are too important and too many to ignore or discard.
Motivation – Virtualization on NUMA platforms (Issues)
● Ad-hoc and Minimum-Effort VM memory allocation schemes.
● For instance, Xen tries to allocate all the memory for a VM from a single memory node and pins the VM to that node, giving a one-to-one mapping between a VM and a node.
● Not always possible to allocate from a single node – VM size, node memory fragmentation, etc.
● Dynamic memory interfaces (such as memory ballooning) could still disrupt the mapping by allocating from some other node.
VM Memory Allocation Strategies
● CONFINED : Allocate the entire VM memory from a single node. Goal : Maximize performance.
● SPLIT : Allocate the VM memory from a set of nodes by splitting equally across the nodes. Goal : Maximize performance (with Enlightenment).
● STRIPED : Interleave the VM memory across a set of nodes. Goal : Predictable (average) performance.
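The three strategies differ only in which node backs each guest page. The following is a minimal sketch of that difference, not Xen's allocator: the function name `place_pages` is hypothetical, and page-granularity interleaving for STRIPED is an assumption (the slides do not state the stripe size).

```python
def place_pages(strategy, num_pages, nodes):
    """Return the node id backing each guest page, in guest-physical order.

    Hypothetical helper for illustration; `nodes` is the list of node ids
    already chosen for the VM.
    """
    if strategy == "CONFINED":
        # Everything comes from a single node.
        return [nodes[0]] * num_pages
    if strategy == "SPLIT":
        # One contiguous, near-equal chunk per node.
        base, extra = divmod(num_pages, len(nodes))
        layout = []
        for i, node in enumerate(nodes):
            layout += [node] * (base + (1 if i < extra else 0))
        return layout
    if strategy == "STRIPED":
        # Round-robin interleave (assumed here to be one page per stripe).
        return [nodes[i % len(nodes)] for i in range(num_pages)]
    raise ValueError(f"unknown strategy: {strategy}")


print(place_pages("SPLIT", 8, [0, 1]))    # contiguous halves
print(place_pages("STRIPED", 8, [0, 1]))  # alternating nodes
```

With SPLIT, a NUMA-aware guest can treat each contiguous chunk as one virtual node; with STRIPED, every region of guest memory sees roughly the same mix of local and remote accesses.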
VM Memory Allocation Strategies - CONFINED
VM Memory Allocation Strategies - SPLIT
VM Memory Allocation Strategies - STRIPED
Automatic VM Memory Allocation Scheme
● TRY : Allocate CONFINED using Best-Fit-Decreasing (BFD).
● TRY : Allocate SPLIT using Best-Fit-Decreasing (BFD), if the guest is NUMA-enabled. Enlighten the guest.
● Allocate STRIPED using First-Fit-Increasing (FFI).
● BFD returns the minimal-subset of nodes.
● FFI returns the maximal-subset of nodes. Used with STRIPED to reduce the fragmentation of free node memory.
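The fallback order above (CONFINED, then SPLIT for NUMA-enabled guests, then STRIPED) can be sketched as follows. The slides do not spell out the BFD and FFI algorithms, so this sketch only models their stated outcomes: the fewest nodes for CONFINED/SPLIT and the most nodes for STRIPED. The names `fit` and `automatic_allocate`, the equal-share rule, and the tightest-node tie-break are all assumptions.

```python
import math


def fit(free, size, k):
    """Pick k nodes that can each hold an equal share of `size` pages.

    Assumption: among eligible nodes, prefer those with the least free
    memory (a best-fit style choice). Returns None if fewer than k fit.
    """
    share = math.ceil(size / k)
    eligible = sorted((n for n in free if free[n] >= share),
                      key=lambda n: (free[n], n))
    return eligible[:k] if len(eligible) >= k else None


def automatic_allocate(free, size, guest_numa_enabled):
    """Return (strategy, nodes) or None if the VM cannot be placed."""
    nodes = fit(free, size, 1)
    if nodes:
        return "CONFINED", nodes
    if guest_numa_enabled:
        # Minimal subset: try the smallest node counts first.
        for k in range(2, len(free) + 1):
            nodes = fit(free, size, k)
            if nodes:
                return "SPLIT", nodes
    # Maximal subset: try the largest node counts first.
    for k in range(len(free), 1, -1):
        nodes = fit(free, size, k)
        if nodes:
            return "STRIPED", nodes
    return None


free_pages = {0: 4, 1: 6, 2: 3, 3: 8}  # hypothetical free pages per node
print(automatic_allocate(free_pages, 5, guest_numa_enabled=False))
print(automatic_allocate(free_pages, 12, guest_numa_enabled=True))
print(automatic_allocate(free_pages, 12, guest_numa_enabled=False))
```

Spreading a STRIPED VM over as many nodes as possible takes a small share from each node, which is one way to read the slide's remark about reducing fragmentation of free node memory.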
VM Memory Allocation Strategy - SPLIT
● Used to construct a strict one-to-one mapping between virtual nodes and physical nodes.
● HVM : Export the VM memory layout using ACPI tables. VM constructs virtual nodes.
● PV : Export the VM memory layout using Virtual NUMA Enlightenment. VM constructs and maintains virtual nodes.
PV NUMA Guest - Enlightenment
PV NUMA Guest - Construction of Virtual Nodes
● Guest reads the Virtual NUMA Enlightenment using a hypercall.
● Guest constructs the (virtual) nodes and (virtual) cpu-to-node mappings.
● Guest (virtual) node distances reflect the actual distances between the underlying physical nodes.
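As a rough illustration of the construction step, the sketch below derives a virtual distance table and a vcpu-to-node map from a virtual-to-physical node mapping. The hypercall itself is not modeled, and the layout of the enlightenment data is not given in the slides, so `vnode_to_pnode`, `phys_distance`, and `vcpus_per_vnode` are hypothetical inputs chosen for the example.

```python
def build_virtual_topology(vnode_to_pnode, phys_distance, vcpus_per_vnode):
    """Build the guest's view of the NUMA topology.

    vnode_to_pnode:  physical node id backing each virtual node
    phys_distance:   square matrix of physical node distances (SLIT-like)
    vcpus_per_vnode: number of virtual CPUs to place on each virtual node
    """
    # Virtual distances are the physical distances between backing nodes.
    vdist = [[phys_distance[p][q] for q in vnode_to_pnode]
             for p in vnode_to_pnode]

    # Assign consecutive vcpu ids to each virtual node in turn.
    vcpu_to_vnode = {}
    vcpu = 0
    for vnode, count in enumerate(vcpus_per_vnode):
        for _ in range(count):
            vcpu_to_vnode[vcpu] = vnode
            vcpu += 1
    return vdist, vcpu_to_vnode


# Hypothetical 4-node host; the guest is SPLIT across physical nodes 1 and 3.
phys = [
    [10, 16, 22, 22],
    [16, 10, 22, 22],
    [22, 22, 10, 16],
    [22, 22, 16, 10],
]
vdist, vcpu_map = build_virtual_topology([1, 3], phys, [2, 2])
print(vdist)
print(vcpu_map)
```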
PV NUMA Guest – Maintenance of Virtual Nodes
● Dynamic memory interfaces could increase/decrease/exchange the VM memory reservations, e.g. ballooning (see the table in slide 7).
● Modify the interfaces to use Virtual NUMA Enlightenment. Maintain the strict mapping between Virtual and Physical nodes.
● Strict approach could lead to starvation in CONFINED/SPLIT VMs.
● Under memory pressure, relax the strict one-to-one mapping between virtual and physical nodes.
● Provide a mechanism to the guests to look-up physical node-id corresponding to a guest physical address.
● Periodically sweep through the VM memory and converge to original state (indefinitely).
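The maintenance behaviour described above can be sketched as three pieces: a populate step that is strict by default and relaxed under memory pressure, a per-page lookup of the backing physical node, and a sweep that moves misplaced pages home when space frees up. This is a toy model under stated assumptions, not the Xen interfaces: `populate`, `sweep`, and the `page_node` map are hypothetical, and real page migration is reduced to bookkeeping.

```python
def populate(page_node, gpfn, home, free, relaxed=False):
    """Back guest page `gpfn`, preferring the virtual node's home node.

    Strict mode fails (returns None) when the home node is full. Relaxed
    mode falls back to any node with free pages, as under memory pressure.
    """
    if free.get(home, 0) > 0:
        node = home
    elif relaxed:
        node = next((n for n in sorted(free) if free[n] > 0), None)
        if node is None:
            return None
    else:
        return None
    free[node] -= 1
    page_node[gpfn] = node  # lets the guest look up the backing node later
    return node


def sweep(page_node, home, free):
    """Move misplaced pages back to `home` while it has room; return count."""
    moved = 0
    for gpfn, node in sorted(page_node.items()):
        if node != home and free.get(home, 0) > 0:
            free[node] += 1
            free[home] -= 1
            page_node[gpfn] = home
            moved += 1
    return moved


free = {0: 1, 1: 2}   # hypothetical free pages per physical node
pages = {}            # guest page frame number -> physical node id
populate(pages, 100, home=0, free=free)
print(populate(pages, 101, home=0, free=free))                # strict
print(populate(pages, 101, home=0, free=free, relaxed=True))  # relaxed
free[0] += 1          # a page is later freed on the home node
print(sweep(pages, home=0, free=free), pages)
```

Running the sweep periodically is what lets a relaxed VM drift back toward the strict one-to-one mapping once the pressure on its home node eases.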
Results – linpack benchmark
Summary
● VM Memory Allocation Strategies for NUMA – CONFINED/SPLIT/STRIPED.
● Automatic VM Memory Allocation Scheme.
● NUMA Guests with SPLIT strategy :
  ● HVM – Inform using SLIT/SRAT ACPI tables
  ● PV – Inform using Enlightenment
● PV NUMA Guests
  ● Construction of Virtual Nodes
  ● Maintenance of Virtual Nodes (e.g. ballooning)
Questions ?
Thank You !