- 1. Virtualization Susanta K. Nanda ECSL CSE 502, Fall05
2. Virtualization at the Hardware Level
-
- Hardware resources are typically under-utilized
-
- Hardware resources directly relate to cost
- Goal: Improve hardware utilization
-
- Share hardware resources across multiple machines
-
- May make sense for network attached storage, but what about
processor, memory, etc.?
-
- Decouple machine from hardware
-
- A machine decoupled from the hardware, i.e., one that does not
necessarily correspond to the hardware
-
- Multiple Virtual Machines on the same physical host could share
the underlying hardware
-
- First VM: IBM System/360 Model 40 VM [1965]
3. Virtual Machine Monitor (VMM)
- A thin layer of software on top of the bare machine to
facilitate virtualization of hardware resources
- Mediates between VMs and the hardware
-
- Create, Destroy, Power Off/On, etc.
-
- Isomorphism : State transitions must be isomorphic to a
physical machine
-
- Isolation : One VM from all others
-
- Performance : Close-to-native
-
- Correctness : Exactly same hardware interface to the guest OS
to support commodity OSes without any modification
4. A Stolen Picture
5. VM: Additional Advantages
-
- Virtual devices through emulation via a combination of software
and other available devices
-
- Example: SCSI-disk using IDE-disks, (virtual) timer
-
- Use: Legacy systems/software
- Hides heterogeneity of the underlying hardware
-
- Ability to switch hardware vendors
-
- Decoupling helps move a VM from one physical host to another,
just like a file
-
- Use: Server consolidation, hardware maintenance, etc.
- OS Debugging, Mixed OS, Event monitoring, Execution Undo, and
Many more
6. Key Concepts: Appearance
- A VM consists of Shared and Dedicated Hardware
-
- Shared: Disk, Memory, NIC, CPU, Printer, etc
-
- Dedicated: Keyboard, Mouse, Display, Speakers, CD-Drive,
etc
-
- A server VM may not require some dedicated devices
-
- Sharable across multiple VMs if they belong to the same
user
7. Key Concepts: State Management
- Each VM would have its own architected state information
-
- Example: registers/memory/disks, page table/TLB
- Not always possible to map every architected state to its
natural level in the host
-
- Insufficient/Unavailable host resources
-
- Example: Registers of a VM may be architected using main memory
in the host
- VMs keep getting switched in/out by the VMM
-
- Isomorphism requires all state transitions to be performed on
the VM states
-
- Performance requires efficient state management
- State Management: Indirection vs. Copying
8. Key Concepts: State Management (contd.)
-
- Hold state for each VM in fixed locations in the host's memory
hierarchy
-
- A pointer managed by VMM indicating the guest state that is
currently active
-
- Example: Register block maintained in memory and a processor
register pointing to the register block of the currently active
VM
-
- Cons: Inefficient (mov eax, ebx requires 2 instructions)
-
- Copy the VM's state information to its natural level in the
memory hierarchy when switched in
-
- Copy it back to the original place when switched out
-
- Example: Copy all the VM registers to the processor
registers
-
- Pros: Efficient (most instructions are executed natively)
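The two strategies on this slide can be contrasted with a minimal sketch. The `Host` class, the dictionary-based register blocks, and the helper names are hypothetical and only model the idea; the instruction counts are returned by the model, not measured.

```python
class Host:
    """Hypothetical host: real registers plus a pointer to the active VM's block."""
    def __init__(self):
        self.cpu_regs = {"eax": 0, "ebx": 0}  # natural level: processor registers
        self.active = None                    # pointer to the active VM's register block

def mov_indirect(host, dst, src):
    """Indirection: guest registers live in memory, so a register-to-register
    mov becomes a load followed by a store (2 host instructions)."""
    block = host.active
    tmp = block[src]      # instruction 1: load from the register block
    block[dst] = tmp      # instruction 2: store to the register block
    return 2

def switch_in(host, vm_block):
    """Copying: move the VM's registers to their natural level."""
    host.cpu_regs.update(vm_block)

def switch_out(host, vm_block):
    """Copying: save the registers back to the VM's block."""
    vm_block.update(host.cpu_regs)

def mov_copied(host, dst, src):
    """After switch-in, the mov runs natively as 1 host instruction."""
    host.cpu_regs[dst] = host.cpu_regs[src]
    return 1
```

With indirection the cost is paid on every instruction; with copying it is paid once per switch in/out.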
9. Key Concepts: Resource Control
- VMM must maintain overall control of the hardware resources
-
- Hardware resources are assigned to VMs when they are
created/executed
-
- Should have a way to get them back when they need to be
assigned to a different VM
-
- Similar to multi-programming in OS
-
- Certain resources are accessible only to and managed by
VMM
-
- Interrupts relating to such resources must then be handled by
VMM
-
- Privileged resources are emulated by VMM for the VM
- All resources that could help maintain control are marked
privileged
-
- Interval timer is used to decide VM scheduling
-
- Page table base register (CR3 on x86) is used to isolate VM
memory
- Issues: VM scheduling (an ideally fair scheduling may not be
good)
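As a rough illustration of timer-driven VM scheduling, the sketch below models each interval-timer interrupt as one loop iteration in which the VMM regains control and picks the next VM. Round-robin is an assumption made for simplicity (the slides do not name a policy), and `run_round_robin` is a hypothetical helper.

```python
from collections import deque

def run_round_robin(vms, ticks):
    """Return the VM chosen at each timer tick under plain round-robin."""
    if not vms:
        return []
    queue = deque(vms)
    schedule = []
    for _ in range(ticks):      # each tick: timer interrupt traps to the VMM
        vm = queue.popleft()    # VMM selects the next VM to switch in
        schedule.append(vm)
        queue.append(vm)        # VM goes to the back of the queue
    return schedule
```

This policy is "ideally fair" in the sense that every VM gets the same number of time slices, which is exactly the kind of policy the slide warns may not be good, for example when one VM is idle and another is busy.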
10. Key Concepts: Native/Hosted VMs
-
- VMM is installed on the bare machine, no host OS
-
- All other VMs are then created through the VMM
-
- Pros: Clean Architecture, Efficient
-
- Cons: Complicated VMM due to device drivers
-
- Example: VMware ESX Server
-
- VMM is installed on top of a host OS
-
- User-mode: VMM runs in non-privileged mode
-
- Dual-mode: VMM runs partly in privileged mode (as a driver on
the host OS) and partly in unprivileged mode (like an
application)
-
- Pros: VMM uses drivers in the host OS for I/O; thin VMM
-
- Cons: Inefficient for I/O intensive applications
-
- Example: Microsoft Virtual Server
11. Processor Virtualization
-
- Guest ISA may differ from Host ISA
-
- Guest and Host ISA must be the same
-
- Some critical instructions may still need to be emulated
-
- Issues: Complexity of discovering and
emulating critical instructions efficiently
12. ISA Virtualizability
- Privileged Instructions (PI)
-
- Instructions that generate a trap when executed in any but
the most-privileged level
-
- Example: LIDT (load interrupt descriptor table)
- Sensitive Instructions (SI)
-
- Instructions whose behavior depends on the current privilege
level
-
- Example: POPF (pops the stack to EFLAGS)
-
-
- In user mode, the Interrupt Enable bit of the EFLAGS register is
not overwritten
-
-
- In system mode, the value is blindly copied
-
- For any conventional third-generation computer, a virtual
machine monitor may be constructed if the set of sensitive
instructions for that computer is a subset of the set of
privileged instructions.
-
- In other words, an ISA is virtualizable if SI is a
subset of PI
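A minimal sketch of the two ideas on this slide. `popf` models only the behavior described above (it ignores IOPL and every other detail of the real instruction), and the instruction sets passed to `satisfies_popek_goldberg` are illustrative, not a complete classification of any ISA. Both function names are hypothetical.

```python
IF_BIT = 1 << 9  # Interrupt Enable flag is bit 9 of EFLAGS

def popf(eflags, popped, system_mode):
    """Simplified POPF: returns the new EFLAGS value and never traps."""
    if system_mode:
        return popped                               # value is copied blindly
    # User mode: the IF bit silently keeps its old value.
    return (popped & ~IF_BIT) | (eflags & IF_BIT)

def satisfies_popek_goldberg(sensitive, privileged):
    """True when every sensitive instruction is also privileged (SI subset of PI)."""
    return set(sensitive) <= set(privileged)
```

Because the user-mode POPF changes behavior without trapping, a VMM running the guest kernel in user mode never gets a chance to emulate it; this is what makes POPF sensitive but not privileged.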
13. What If the ISA Is Not Virtualizable?
- All is not lost if an ISA violates the Popek/Goldberg theorem
-
- However, it brings additional complications and inefficiency
to the VMM implementation
-
- Instructions that are sensitive but not privileged
-
- x86 has 17 critical instructions
-
- All critical instructions must be emulated by VMM
-
- Binary Scanner: Inspects code and inserts traps at critical
instructions
-
- Dispatcher: Gets control when a trap occurs
-
- Allocator: Allocates machine resources (e.g. load relocation
bounds register)
-
- Interpreters: Each interpreter interprets one privileged
instruction
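The scanner/dispatcher/interpreter split can be sketched as follows. Instructions are modeled as plain mnemonic strings, `CRITICAL` lists only two of the critical x86 instructions for illustration, and all function names and the `vm` dictionary layout are hypothetical.

```python
CRITICAL = {"POPF", "SGDT"}  # illustrative subset, not the full list

def scan(code):
    """Binary scanner: mark each critical instruction so that it traps."""
    return [("TRAP", op) if op in CRITICAL else ("RUN", op) for op in code]

def emulate_popf(vm):
    # Apply the popped flags to the VM's virtual EFLAGS, not the real one.
    vm["eflags"] = vm["stack"].pop()

def emulate_sgdt(vm):
    # Report the guest's virtual GDT base rather than the host's.
    vm["result"] = vm["virtual_gdt_base"]

# One interpreter routine per emulated instruction.
INTERPRETERS = {"POPF": emulate_popf, "SGDT": emulate_sgdt}

def run(code, vm):
    """Dispatcher loop: traps go to an interpreter, the rest run natively."""
    native = []
    for kind, op in scan(code):
        if kind == "TRAP":
            INTERPRETERS[op](vm)   # dispatcher hands control to the interpreter
        else:
            native.append(op)      # stands in for direct execution on the CPU
    return native
```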
14. Memory Virtualization
- VM support in traditional architectures
-
- Architected TLB vs. Architected Page Table
-
- One level of indirection: Page Table
- VMM requires two levels of indirection
-
- Virtual Memory to Real Memory: Page Table (Guest OS)
-
- Real Memory to Physical Memory: Real Map Table (VMM)
-
- Additional Data Structures
-
-
- Shadow Page Tables (VMM): Used by hardware for address
translation, directly maps virtual address to physical (not real)
address
-
-
- VMM intercepts and emulates Page table modifications, Page
table base register modifications by the Guest OS
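A minimal sketch of how a shadow page table relates to the other two tables, assuming single-level tables modeled as dictionaries keyed by page number. `build_shadow` and `guest_writes_pte` are hypothetical names; a real VMM works on hardware-defined page-table formats.

```python
def build_shadow(guest_page_table, real_map):
    """Compose virtual->real (guest OS) with real->physical (VMM)."""
    shadow = {}
    for vpage, rpage in guest_page_table.items():
        if rpage in real_map:              # real page is backed by physical memory
            shadow[vpage] = real_map[rpage]
    return shadow

def guest_writes_pte(guest_page_table, real_map, shadow, vpage, rpage):
    """Emulate an intercepted guest page-table write and resync the shadow entry."""
    guest_page_table[vpage] = rpage
    if rpage in real_map:
        shadow[vpage] = real_map[rpage]
    else:
        shadow.pop(vpage, None)            # no backing page: leave it unmapped
```

The hardware only ever walks `shadow`, so a guest access costs a single translation even though two mappings are logically involved.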
15. Memory Virtualization (contd.)
-
- Virtual TLB: maintained by guest OS
-
-
- Virtual ASID, Virtual Page, Real Page
-
- Real TLB: maintained by VMM
-
-
- Real ASID, Virtual Page, Physical Page
-
- VMM intercepts/emulates all modifications to TLB by the guest
OS
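The relation between the two TLBs can be sketched as below. The per-VM `asid_map` (virtual ASID to real ASID) and `real_map` (real page to physical page) are assumptions about how a VMM might keep the translation; the slides only name the entry formats.

```python
def emulate_tlb_write(vm, real_tlb, vasid, vpage, rpage):
    """Emulate an intercepted guest TLB write."""
    # Virtual TLB entry, as the guest OS sees it.
    vm["virtual_tlb"][(vasid, vpage)] = rpage
    # Real TLB entry, as the hardware uses it.
    rasid = vm["asid_map"][vasid]
    ppage = vm["real_map"][rpage]
    real_tlb[(rasid, vpage)] = ppage
```

For example, with `asid_map = {1: 17}` and `real_map = {5: 100}`, a guest write of (ASID 1, virtual page 3, real page 5) installs (ASID 17, virtual page 3, physical page 100) in the real TLB.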
16. I/O Virtualization
-
- Dedicated Devices: Display, Keyboard, Mouse, etc.
-
- Partitioned Devices: Disk
-
- Shared Devices: Network adapter
-
- Non-existent Physical Devices: virtual network adapter
- Virtualizing I/O Operations
-
- Intercepting/emulating IN/OUT, INS/OUTS
-
- Map virtual resource ID to physical device ID
-
- De-multiplexing the interrupts for the devices
- Virtualizing I/O in Hosted VMM
-
- VMM-driver translates I/O instructions back to system calls in
the host OS
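A rough sketch of the two bookkeeping steps named above: mapping a virtual resource ID to a physical one when an OUT is intercepted, and de-multiplexing a physical interrupt to the owning VM. The dictionary layouts, the `tag` used for routing, and both function names are hypothetical.

```python
def emulate_out(port_map, device_regs, vm, vport, value):
    """Intercepted OUT: translate the VM's virtual port, then perform the write."""
    pport = port_map[(vm, vport)]     # virtual resource ID -> physical device ID
    device_regs[pport] = value        # stands in for the real port write
    return pport

def demux_interrupt(irq_routes, irq, tag):
    """Pick the VM that should receive a physical interrupt.

    `tag` is whatever distinguishes the VMs sharing the device
    (for a shared NIC this might be a destination address)."""
    return irq_routes[(irq, tag)]
```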
17. Performance Degradation in VMMs
- Setup: VM State initialization
- Emulation: Emulating critical instructions
-
- Interrupts generated by a program within a VM have to be handled
first by the VMM, even though that is sometimes not required
- State Saving: During world switches
- Time Elongation: Memory references take longer
18. VT-x: Vanderpool Technology
-
- VMX Root and VMX Non-root
-
- All four privilege levels (rings) are available in both root and
non-root in VMX mode
-
-
- Thus, four new privilege levels, less privileged than the Pentium's
-
- Guest VMs can run in VMX non-root
-
- Host (Hosted VMM) and VMM in VMX root
-
- VMX root has access to a new set of instructions
-
- Critical shared resources are kept under the control of a
monitor in VMX root
-
- VMX non-root ring 0 does not have access to the critical
resources
-
- An example of a critical resource: Memory for state
management
19. An Example Operation
- VMXON: Switch into VMX mode: To VMM
- VMLAUNCH VM1: Start executing VM1 in VMX non-root
operation
- VM1 Exits: Go back to VMM
- VMLAUNCH VM2: Start executing VM2
- VM2 Exits: Go back to VMM
- VMRESUME VM2: Switch to VM2 again
- VM2 Exits: Go back to VMM
- VMRESUME VM2: Switch to VM2
- VMRESUME VM1: VM2 exits, VM1 switched in
- VMXOFF: Get back to Regular mode
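The sequence above can be replayed on a toy state machine. `VmxCpu` is a hypothetical model, not real hardware behavior: it tracks only the mode and which VMs have been launched, it makes every VM exit explicit, and it omits steps such as selecting the current VMCS.

```python
class VmxCpu:
    """Toy model of VMX mode transitions."""

    def __init__(self):
        self.mode = "regular"   # "regular", "root", or "non-root"
        self.launched = set()   # VMs that have been entered at least once
        self.current = None     # VM executing in VMX non-root, if any

    def _require(self, mode):
        if self.mode != mode:
            raise RuntimeError(f"expected {mode} mode, but in {self.mode}")

    def _enter(self, vm):
        self.mode, self.current = "non-root", vm

    def vmxon(self):
        self._require("regular")
        self.mode = "root"

    def vmlaunch(self, vm):      # first entry into a VM
        self._require("root")
        if vm in self.launched:
            raise RuntimeError("already launched; use vmresume")
        self.launched.add(vm)
        self._enter(vm)

    def vmresume(self, vm):      # later entries into the same VM
        self._require("root")
        if vm not in self.launched:
            raise RuntimeError("not launched yet; use vmlaunch")
        self._enter(vm)

    def vm_exit(self):           # control returns to the VMM
        self._require("non-root")
        self.mode, self.current = "root", None

    def vmxoff(self):
        self._require("root")
        self.mode = "regular"

cpu = VmxCpu()
cpu.vmxon()
cpu.vmlaunch("VM1"); cpu.vm_exit()
cpu.vmlaunch("VM2"); cpu.vm_exit()
cpu.vmresume("VM2"); cpu.vm_exit()
cpu.vmresume("VM2"); cpu.vm_exit()
cpu.vmresume("VM1"); cpu.vm_exit()
cpu.vmxoff()
```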
20. Maintenance of State
-
- Fully specified, various fields defined
-
- Manipulated only by hardware or software in VMX-root
-
- VMPTR points to the VMCS structure of the currently executing
VM
-
- There can be multiple VMs active at any point, but only one of
them would be executing
-
- VMWRITE/VMREAD to write/read contents of VMCS
-
- State: More than normal, e.g. architecturally hidden part of
segment registers
- Control Fields: Define under what condition a VM exits
-
- Example: Some specific interrupt/instruction/etc, number of
model-specific registers (MSRs) that need to be saved when VM
exits
-
- Informs the VMM the reason for exit along with supporting
info
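A simplified sketch of how control fields decide when a VM exits and how the reason is reported back. The `Vmcs` dataclass, its field names, and the string-valued events are hypothetical; the real VMCS is a hardware-defined structure accessed through VMREAD/VMWRITE.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Vmcs:
    guest_state: dict = field(default_factory=dict)  # saved on VM exit
    host_state: dict = field(default_factory=dict)   # loaded on VM exit
    exit_on: set = field(default_factory=set)        # control fields (simplified)
    exit_reason: Optional[str] = None                # filled in when the VM exits

def handle_event(vmcs, event):
    """Return True if `event` causes a VM exit, recording the reason for the VMM."""
    if event in vmcs.exit_on:
        vmcs.exit_reason = event
        return True
    return False                                     # guest keeps running
```

For example, a VMCS with `exit_on={"HLT", "external-interrupt"}` lets the guest execute other instructions undisturbed while returning to the VMM on either listed event.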
21. Maintenance of State (contd.)
-
- Guest State: Register state, Interruptibility state
-
- Host State: Register State
-
-
- Pin/Processor-based execution controls, bitmap fields, etc
-
-
- Control bitmap, MSR Controls
-
-
- Control bitmap, MSR Controls, Controls for Event Injection
-
- Basic Info: VM-Exit Info, Vectoring Event Info
-
- Other Exit Info: Due to event delivery, due to instruction
execution