RISC-V Privileged Architecture Proposal
Krste Asanović, Rimas Avizienis, Yunsup Lee, David Patterson, Andrew Waterman
[email protected]
http://www.riscv.org
2nd RISC-V Workshop, Berkeley, CA
June 29, 2015
RISC-V Privileged Architecture
§ Provide clean split between layers of the software stack
§ Application communicates with Application Execution Environment (AEE) via Application Binary Interface (ABI)
§ OS communicates with Supervisor Execution Environment (SEE) via Supervisor Binary Interface (SBI)
§ Hypervisor communicates with Hypervisor Execution Environment (HEE) via Hypervisor Binary Interface (HBI)
§ All levels of ISA designed to support virtualization
[Excerpt from Volume II: RISC-V Privileged Architectures, v1.1 draft]
[Figure 1.1 depicts three implementation stacks: an application over an ABI on an AEE; applications over ABIs on an OS, which runs over an SBI on an SEE; and multiple OSs over SBIs on a hypervisor, which runs over an HBI on an HEE.]
Figure 1.1: Different implementation stacks supporting various forms of privileged execution.
the OS, which provides the AEE. Just as applications interface with an AEE via an ABI, RISC-V operating systems interface with a supervisor execution environment (SEE) via a supervisor binary interface (SBI). An SBI comprises the user-level and supervisor-level ISA together with a set of SBI function calls. Using a single SBI across all SEE implementations allows a single OS binary image to run on any SEE. The SEE can be a simple boot loader and BIOS-style IO system in a low-end hardware platform, or a hypervisor-provided virtual machine in a high-end server, or a thin translation layer over a host operating system in an architecture simulation environment.
The rightmost configuration shows a virtual machine monitor configuration where multiple multiprogrammed OSs are supported by a single hypervisor. Each OS communicates via an SBI with the hypervisor, which provides the SEE. The hypervisor communicates with the hypervisor execution environment (HEE) using a hypervisor binary interface, to isolate the hypervisor from details of the hardware platform.
Our graphical convention represents abstract interfaces using black boxes with white text, to separate them from actual components.
The various ABI, SBI, and HBIs are still a work-in-progress, but we anticipate the SBI and HBI to support devices via virtualized device interfaces similar to virtio [2], and to support device discovery. In this manner, only one set of device drivers need be written that can support any OS or hypervisor, and which can also be shared with the boot environment.
Hardware implementations of the RISC-V ISA will generally require additional features beyond the privileged ISA to support the various execution environments (AEE, SEE, or HEE), but these we consider separately as part of a hardware abstraction layer (HAL), as shown in Figure 1.2. Note …
[Figure 1.2 depicts the same stacks as Figure 1.1, with each execution environment (AEE, SEE, HEE) running over a HAL on the underlying hardware.]
Figure 1.2: Hardware abstraction layers (HALs) abstract underlying hardware platforms from the execution environments.
RISC-V Hardware Abstraction Layer
§ Execution environments communicate with hardware platforms via Hardware Abstraction Layer (HAL)
§ Details of execution environment and hardware platforms isolated from OS/Hypervisor ports
Privilege Modes
§ Four privilege modes
- User (U-mode)
- Supervisor (S-mode)
- Hypervisor (H-mode)
- Machine (M-mode)
§ Supported combinations of modes:
- M (simple embedded systems)
- M, U (embedded systems with protection)
- M, S, U (systems running Unix-style operating systems)
- M, H, S, U (systems running hypervisors)
Simple Embedded Systems
§ Simplest implementation needs only M-mode
§ No address translation/protection
- "Mbare" bare-metal mode
- Trap bad physical addresses precisely
§ Application code is trusted
§ Low implementation cost
- 27 bits of architectural state (in addition to user ISA)
- +27 more bits for timers
- +27 more for basic performance counters
Small System Memory-Management Architectures
§ User mode (M+U) adds basic translation/protection
§ Mbb
- Base-and-bounds translation/protection
- PA = VA + mbase; VA ∈ [0, mbound−1]
- Sufficient for basic multiprogramming, provided segments fit in memory
- +26 bits of arch. state vs. Mbare
§ Mbbid
- Separate base-and-bounds for instructions & data
- Can share instruction segment between multiple processes
- +26 bits of arch. state vs. Mbb
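The Mbb translation rule above (PA = VA + mbase, valid only for VA below mbound) is simple enough to sketch as a software model in C; the function name and interface here are illustrative, not part of the spec:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy software model of Mbb base-and-bounds translation.
 * A virtual address is valid only if it lies in [0, mbound-1];
 * the physical address is then VA + mbase.  A false return
 * corresponds to the hardware taking a precise trap. */
static bool mbb_translate(uint64_t va, uint64_t mbase, uint64_t mbound,
                          uint64_t *pa) {
    if (va >= mbound)
        return false;          /* out of bounds: would trap */
    *pa = va + mbase;
    return true;
}
```

Mbbid then simply keeps two such (base, bound) register pairs, one consulted on instruction fetches and one on loads/stores, which is what lets an instruction segment be shared between processes.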
Virtual Memory Architectures
§ Designed to support current Unix-style operating systems
§ Sv32 (RV32)
- Demand-paged 32-bit virtual address spaces
- 2-level page table
- 4 KiB pages, 4 MiB megapages
§ Sv39 (RV64)
- Demand-paged 39-bit virtual address spaces
- 3-level page table
- 4 KiB pages, 2 MiB megapages, 1 GiB gigapages
§ Sv48, Sv57, Sv64
- Sv39 + 1/2/3 more page-table levels
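The Sv39 parameters above (39-bit VA, 4 KiB pages, 3 levels) imply a 12-bit page offset and three 9-bit virtual-page-number fields. A sketch of the field extraction; the helper names are mine, not from the spec:

```c
#include <stdint.h>

#define PGSHIFT 12   /* 4 KiB base pages */
#define VPNBITS  9   /* 512 8-byte PTEs per 4 KiB page-table page */

/* Extract the level-th VPN field of an Sv39 virtual address:
 * level 2 is the root of the walk, level 0 the leaf. */
static uint64_t sv39_vpn(uint64_t va, int level) {
    return (va >> (PGSHIFT + level * VPNBITS)) & ((1u << VPNBITS) - 1);
}

/* Byte offset within the 4 KiB page. */
static uint64_t page_offset(uint64_t va) {
    return va & ((1u << PGSHIFT) - 1);
}
```

The megapage and gigapage sizes fall out of the same numbers: stopping the walk at level 1 leaves 12 + 9 = 21 offset bits (2 MiB), and stopping at level 2 leaves 30 bits (1 GiB).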
Why 4 KiB Pages?
§ Initially planned to scale base page size with XLEN
- 8 KiB for RV64, 16 KiB for RV128
- Greater TLB reach & smaller miss penalty
§ Concerns about porting low-level software:
  size += 4095; size &= ~4095;
  addr = mmap(0, size, …);
§ Internal fragmentation exacerbated by larger pages
§ Transparent superpage support kind of works
§ Upshot: bite the bullet; deal with 4 KiB pages in the μ-arch.
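The rounding idiom quoted above is exactly the kind of code that breaks when the base page size changes, because it hard-codes 4095. The usual portable fix is to query the page size at runtime, e.g. via POSIX sysconf; a minimal sketch, assuming a power-of-two page size:

```c
#include <stddef.h>
#include <unistd.h>

/* Round size up to a multiple of the runtime page size instead of
 * hard-coding 4096, so the code survives a base-page-size change. */
static size_t round_up_to_page(size_t size) {
    size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
    return (size + pagesize - 1) & ~(pagesize - 1);
}
```

(The mask trick assumes the page size is a power of two, which POSIX does not strictly guarantee but which holds on the systems the slide has in mind.)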
When 2^64 Bytes Doesn't Cut It
§ RV128I natively supports much larger address space
§ Proposal: natural extension of RV32/RV64 schemes
§ Sv68, 76, 84, …
- 7+ level page tables
- Mitigate TLB miss cost by skipping levels in sparse regions
- Considered, but rejected, inverted page tables
§ Also Sv44, 52, 60 for apps that want XLEN=128 but don't need large VA spaces
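The level counts above follow from the same geometry as Sv32/Sv39: with 4 KiB pages, the offset covers 12 bits and each table level resolves 9 more. A quick check of that arithmetic (the formula is my inference from the Sv32/Sv39 pattern, not stated in the slides):

```c
/* Page-table levels needed for an N-bit VA with a 12-bit page
 * offset and 9 VA bits resolved per level: ceil((N - 12) / 9). */
static int pt_levels(int va_bits) {
    return (va_bits - 12 + 8) / 9;   /* integer ceiling division */
}
```

For Sv39 this gives 3 and for Sv48 it gives 4, matching the earlier slide, and for Sv68 it gives 7, matching the "7+ level page tables" bullet.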
Physical Memory Attributes
§ Most VM systems encode properties of a physical memory region in the page tables
- Cacheability, write-through-ness, consistency model
§ Wrong place for this info
- Granularity not necessarily tied to page size
- Virtualization hole
§ Less applicable in SoC era
- Physical memory attributes may be known at design time
- Cheap coherent DMA might render cacheability control irrelevant
Interrupts & Devices
§ Supervisor software sees only three kinds of interrupt
- Timer interrupts
- Software interrupts
- Device interrupts
§ Device interactions via virtio-style interface
- Supports clean virtualization
- OS isolated from driver code
§ Can still support a classic, virtualization-unfriendly OS by running part of the OS in M-mode
Supervisor Binary Interface
§ Platform-specific functionality abstracted behind SBI
- Query physical memory map
- Enumerate devices
- Get hardware thread ID and # of hardware threads
- Save/restore coprocessor state
- Query timer properties, set up timer interrupts
- Send interprocessor interrupts
- Send TLB shootdowns
- Reboot/shutdown
§ Simplifies hardware acceleration
§ Simplifies virtualization
§ Draft SBI to be released with next privileged ISA draft
New Instructions
§ Only 4 new instructions to support M+S modes
- SFENCE.VM (synchronize page-table updates)
- ERET (exception return)
- MRTS (redirect trap from machine to supervisor)
- WFI (wait for interrupt)
Hypervisor Support
§ Straightforward to support classic virtualization
- OS runs in user mode
- Privileged operations trapped & emulated
§ HW-accelerated virtualization (H-mode) planned but not yet specified
§ SBI abstractions should simplify H-mode design
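The classic trap-and-emulate scheme above can be illustrated with a toy software model (entirely my own sketch, with invented opcodes, not RISC-V encodings): the "guest" runs in user mode, unprivileged operations execute directly, and each privileged operation traps to a monitor that emulates it against shadow state.

```c
#include <stdint.h>

/* Toy instruction set: user-mode ops execute directly; the one
 * privileged op (reading a "privileged CSR") traps and is emulated. */
enum op { OP_ADD_IMM, OP_READ_PRIVREG };

struct guest {
    uint64_t acc;        /* guest-visible accumulator */
    uint64_t shadow_csr; /* privileged state held by the monitor */
};

/* Monitor's trap handler: emulate the privileged op, then resume. */
static void emulate_privileged(struct guest *g) {
    g->acc = g->shadow_csr;  /* guest believes it read the real CSR */
}

static void run_guest(struct guest *g, const enum op *prog, int n) {
    for (int i = 0; i < n; i++) {
        switch (prog[i]) {
        case OP_ADD_IMM:
            g->acc += 1;            /* unprivileged: runs directly */
            break;
        case OP_READ_PRIVREG:
            emulate_privileged(g);  /* privileged: trap + emulate */
            break;
        }
    }
}
```

The point of H-mode hardware acceleration is to shrink how often this trap path is taken; the software structure stays the same.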
Implementation Status
§ Draft v1.7 in Spike, Rocket, BOOM, Z-scale
- Sv32, Sv39 in Spike
- Sv39 in Rocket, BOOM
- Mbare => Mbb in Z-scale
§ Expect to tape out Rocket implementation in Sept.
§ SMP Linux port up & running on Spike
§ Draft spec v1.8 this summer
§ Frozen spec v2.0 this fall
Modest RISC-V Project Goal