Computer Architecture Virtual Memory (VM) – x86. By Dan Tsafrir, 30/5/2011 Presentation based on slides by Lihu Rappoport. http://www.youtube.com/watch?v=3ye2OXj32DM (funny beginning). Reminder: VM motivation. VM provides Illusion of large memory Illusion of contiguity - PowerPoint PPT Presentation
Computer Architecture 2011 – VM x86
Computer Architecture
Virtual Memory (VM) – x86
By Dan Tsafrir, 30/5/2011
Presentation based on slides by Lihu Rappoport
http://www.youtube.com/watch?v=3ye2OXj32DM (funny beginning)
Reminder: VM motivation
VM provides
– Illusion of large memory
– Illusion of contiguity
– Ability to overcommit memory
– Process isolation
Reminder: page table translates VA=>PA
[Figure: the virtual page number indexes the page table. Each entry carries a valid bit and points to a physical memory frame (valid = 1) or to a disk address (valid = 0).]
Think of it as a hash table that maps VA to PA
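The hash-table view of the page table can be sketched in a few lines of Python. This is only an illustration: the table contents, the `PageFault` exception, and the `translate` helper are made-up names, and the 4KB page size is borrowed from the later slides.

```python
PAGE_SIZE = 4096  # 4KB pages, as used later in the slides

# Hypothetical page table: virtual page number -> (valid bit, frame number or disk address)
page_table = {
    0: (1, 7),          # resident in physical frame 7
    1: (1, 3),          # resident in physical frame 3
    2: (0, "disk:42"),  # not resident: the entry holds a disk address instead
}


class PageFault(Exception):
    """Raised when the requested page is not in physical memory."""


def translate(va):
    """Map a virtual address to a physical address, or raise PageFault."""
    vpn, offset = divmod(va, PAGE_SIZE)
    valid, target = page_table.get(vpn, (0, None))
    if not valid:
        raise PageFault(f"VPN {vpn} not resident (backing store: {target})")
    return target * PAGE_SIZE + offset
```

For example, `translate(0x1234)` falls in virtual page 1, which maps to frame 3, so the result is `0x3234`; `translate(0x2000)` raises `PageFault` because page 2 is on disk.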
Reminder: TLB accelerates translation
[Figure: the virtual page number is first looked up in the TLB, whose entries hold a valid bit, a tag, and a physical page. On a TLB miss the page table is consulted; its entries hold a valid bit and a physical page or disk address.]
TLB is a VA => PA cache
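A minimal sketch of the TLB as a small cache in front of the page table is shown below. It assumes a fully associative TLB with LRU replacement and a flat `page_table` dict; real TLBs differ in organization and replacement policy, and the class and function names are hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096


class TLB:
    """Tiny fully associative TLB with LRU replacement (an assumption)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()  # vpn -> pfn, least recently used first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table):
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        self.misses += 1
        pfn = page_table[vpn]              # slow path: consult the page table
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[vpn] = pfn
        return pfn


def translate(va, tlb, page_table):
    vpn, offset = divmod(va, PAGE_SIZE)
    return tlb.lookup(vpn, page_table) * PAGE_SIZE + offset
```

Repeated accesses to the same page hit in the TLB and skip the page-table lookup, which is the whole point of the structure.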
Reminder: VM concepts
A page can be
– Not yet loaded
– Loaded
– On disk
A loaded page can be
– Dirty
– Clean
When a page is not loaded (P bit clear), a page fault occurs
– It may require evicting a loaded page to make room for the new one
– The OS prioritizes eviction by LRU and the dirty/clean/avail bits
– A dirty page must be written to disk; a clean one need not be
– New page is either loaded from disk or “initialized”
– CPU sets the page’s “accessed” flag when it is accessed, and “dirty” when it is written
Goal
In the context of x86…
Provide a method to map
– From virtual address (used by program)
– To: physical address
Method should be efficient
– Can generally be exercised by HW alone
– Typically no SW involvement
32BIT X86 REGULAR PAGING
Hierarchical translation
x86 supports 4KB & 4MB pages
– Q: why would we want a 4MB page (called a “super-page”)?
– A: TLB is small…
Page directory
– Each process has its own page directory (but threads share it)
  CR3 points to the page directory of the current process
– Holds 1024 PDEs (page-directory entries), each is 32 bits
– Each PDE contains a PS (“page size”) flag
  PS=1: PDE points directly to a 4MB (super)page
  PS=0: PDE points to a “page table” whose entries point to 4KB pages
Page table
– Holds 1024 PTEs (page-table entries), each is 32 bits
– Each PTE points to a 4KB page in physical memory
Mapping only 4KB pages (typical)
2-level hierarchy
– All pages are 4KB aligned
– Total of 2^20 (=1M) 4KB pages = 4GB
DIR (10 bits)
– Points to a PDE in the page directory
– We assume all PDEs have PS=0 (no superpaging)
– => Each PDE provides 20 bits of the 4KB-aligned base physical address of a 4KB page table
TABLE (10 bits)
– Points to a PTE in the page table
– PTE provides a 20 bit, 4KB-aligned base physical address of a 4KB page
OFFSET (12 bits)
– Offset within the selected 4KB page
[Figure: 32bit linear address split into DIR (bits 31:22), TABLE (bits 21:12), and OFFSET (bits 11:0). CR3 (PDBR) holds the 4KB-aligned base (20 bits; 20+12=32) of the 4KB page directory of 1K PDEs; the selected PDE gives the 20-bit base of a 4KB page table of 1K PTEs; the selected PTE gives the 20-bit base of the 4K page, and OFFSET selects the data within it.]
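The DIR/TABLE/OFFSET split and the two-level walk can be sketched as follows. This is a simplified model, not how hardware is programmed: `mem` is a hypothetical dict standing in for physical memory (physical address -> 32-bit entry), and only the present flag is checked.

```python
P = 1 << 0  # present flag (bit 0 of a PDE/PTE)


def split_4kb(linear):
    """Split a 32-bit linear address into DIR (10), TABLE (10), OFFSET (12)."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def walk_4kb(cr3, linear, mem):
    """Translate a linear address through page directory and page table."""
    d, t, offset = split_4kb(linear)
    pde = mem[(cr3 & 0xFFFFF000) + 4 * d]   # each PDE is 4 bytes
    if not pde & P:
        raise LookupError("page fault: PDE not present")
    pte = mem[(pde & 0xFFFFF000) + 4 * t]   # each PTE is 4 bytes
    if not pte & P:
        raise LookupError("page fault: PTE not present")
    return (pte & 0xFFFFF000) | offset      # 20-bit base + 12-bit offset
```

With a page directory at `0x1000`, a page table at `0x2000`, and a page at `0x5000`, the linear address `0x00403ABC` (DIR=1, TABLE=3, OFFSET=0xABC) translates to `0x5ABC`.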
Mapping only 4MB pages
1-level hierarchy
– All pages are 4MB aligned
– Total of 2^10 (=1K) 4MB pages = 4GB
DIR (10 bits)
– Points to a PDE in the page directory
– We assume all PDEs have PS=1 (superpages only)
– => Each PDE provides 10 bits of the 4MB-aligned base physical address of a 4MB page
TABLE (10 bits)
– None! (moved to offset)
OFFSET (22 bits)
– Offset within the selected 4MB page
Fine print
– Must set the PSE flag in CR4 for 4MB support to work
– Otherwise, PS=1 flag settings are ignored
[Figure: 32bit linear address split into DIR (bits 31:22) and OFFSET (bits 21:0). CR3 (PDBR) holds the 4KB-aligned base (20 bits; 20+12=32) of the 4KB page directory of 1K PDEs; the selected PDE gives the 10-bit base of a 4MB page, and OFFSET selects the data within it.]
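For the superpage case the walk stops at the PDE. A short sketch, assuming the PS flag sits at bit 7 (as in the PDE format slides) and using a hypothetical helper name:

```python
P = 1 << 0   # present
PS = 1 << 7  # page size: 1 means the PDE maps a 4MB page directly


def translate_4mb(pde, linear):
    """Combine the 10-bit page base from a PS=1 PDE with the 22-bit offset."""
    if not pde & P:
        raise LookupError("page fault: PDE not present")
    if not pde & PS:
        raise ValueError("PDE points to a page table, not a 4MB page")
    return (pde & 0xFFC00000) | (linear & 0x003FFFFF)


# Reach of the page directory with superpages only: 1024 PDEs x 4MB each
assert 1024 * (4 << 20) == 1 << 32  # = 4GB
```

The same 1024-entry directory that needs 1024 page tables to cover 4GB with 4KB pages covers it alone with 4MB pages.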
Mixing 4KB & 4MB pages
Works “out of the box”
– When CR4.PSE=1
– Alignment constraints: 4MB for superpages, 4KB for regular pages
TLB issues?
– No, as the CPU maintains 4MB and 4KB entries in separate TLBs
Benefits
– Superpages are often used for frequently used kernel code
– Frees up 4KB TLB entries
– Reduces TLB misses => improves overall system performance
PDE & PTE format
20 bit physical address
– 4K-aligned pointer
12 bits flags
– Virtual memory: present, accessed, dirty
– Protection: read, write, user, privileged
– Caching: WB, WT, disable
– 3 bits for OS usage
Page Dir Entry (bit positions in parentheses)
Page Frame Address (31:12) | AVAIL (11:9) | 0 (8) | Page Size, 0: 4 Kbyte (7) | 0 (6) | A (5) | PCD (4) | PWT (3) | U (2) | W (1) | P (0)
Page Table Entry (bit positions in parentheses)
Page Frame Address (31:12) | AVAIL (11:9) | 0 0 (8:7) | D (6) | A (5) | PCD (4) | PWT (3) | U (2) | W (1) | P (0)
Legend: P = Present, W = Writable, U = User, PWT = Write-Through, PCD = Cache Disable, A = Accessed, D = Dirty, AVAIL = Available for OS Use
Bits shown as 0 are reserved for future use (should be zero)
4KB-page PTE format
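Bit layout (bit positions in parentheses)
Page Base Address (31:12) | AVAIL (11:9) | G (8) | PAT (7) | D (6) | A (5) | PCD (4) | PWT (3) | U/S (2) | R/W (1) | P (0)
– P: Present
– R/W: Writable
– U/S: User / Supervisor
– PWT: Write-Through
– PCD: Cache Disable
– A: Accessed
– D: Dirty
– PAT: Page Table Attribute Index
– G: Global Page
– AVAIL: Available for OS Use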
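To make the 4KB-page PTE layout concrete, the sketch below unpacks a 32-bit PTE into its fields. The bit positions follow the layout on this slide; the `decode_pte` helper and the dictionary representation are illustrative choices only.

```python
# Single-bit flags of a 4KB-page PTE, keyed by name, valued by bit position
FLAG_BITS = {
    "P": 0, "R/W": 1, "U/S": 2, "PWT": 3, "PCD": 4,
    "A": 5, "D": 6, "PAT": 7, "G": 8,
}


def decode_pte(pte):
    """Return a dict with every flag, the AVAIL field, and the page base."""
    fields = {name: (pte >> bit) & 1 for name, bit in FLAG_BITS.items()}
    fields["AVAIL"] = (pte >> 9) & 0x7   # bits 11:9, free for OS use
    fields["base"] = pte & 0xFFFFF000    # bits 31:12, 4KB-aligned page base
    return fields
```

For instance, `decode_pte(0x12345067)` reports a present, writable, user-accessible page at base `0x12345000` with both the accessed and dirty flags set.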
4KB-page PDE format
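Bit layout (bit positions in parentheses)
Page Table Base Address (31:12) | AVAIL (11:9) | G (8) | PS (7) | AVL (6) | A (5) | PCD (4) | PWT (3) | U/S (2) | R/W (1) | P (0)
– P: Present
– R/W: Writable
– U/S: User / Supervisor
– PWT: Write-Through
– PCD: Cache Disable
– A: Accessed
– PS: Page Size (0 indicates 4 Kbytes)
– G: Global Page (ignored)
– AVAIL, AVL: Available for OS Use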
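4MB-page PDE format
Bit layout (bit positions in parentheses)
Page Base Address (31:22) | Reserved (21:13) | PAT (12) | AVAIL (11:9) | G (8) | PS (7) | D (6) | A (5) | PCD (4) | PWT (3) | U/S (2) | R/W (1) | P (0)
– P: Present
– R/W: Writable
– U/S: User / Supervisor
– PWT: Write-Through
– PCD: Cache Disable
– A: Accessed
– D: Dirty
– PS: Page Size (1 indicates 4 Mbytes)
– G: Global Page
– AVAIL: Available for OS Use
– PAT: Page Table Attribute Index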
VM attributes: present flag (P)
Set => page in physical memory
– Translation is carried out by the MMU (memory management unit)
Clear => page not in physical memory
– When encountered by the MMU => generates a page-fault exception
– Faulting address is available to the SW exception handler
MMU does not set/clear this flag (only reads it)
– It’s up to the OS
Upon page-fault exception => OS typically does the following:
1. Copy page from disk to memory (unless already in buffer cache)
2. Update PTE/PDE with page RAM address
3. P = 1; dirty = accessed = 0; etc.
4. Invalidate associated PTE in TLB
5. Resume program at the faulting instruction
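The five steps listed above can be mimicked in a toy model. Everything here is a stand-in: dicts play the role of the page table, TLB, disk, and RAM, a free frame is assumed to exist (no eviction), and `handle_page_fault` is a hypothetical name rather than any real kernel routine.

```python
def handle_page_fault(vpn, page_table, tlb, disk, memory, free_frames):
    """Bring one virtual page into memory, following the slide's five steps."""
    frame = free_frames.pop()                # assumes a free frame is available
    memory[frame] = disk[vpn]                # 1. copy page from disk to memory
    pte = page_table[vpn]
    pte["frame"] = frame                     # 2. update PTE with the RAM address
    pte["P"], pte["A"], pte["D"] = 1, 0, 0   # 3. present; accessed/dirty cleared
    tlb.pop(vpn, None)                       # 4. invalidate any stale TLB entry
    return frame                             # 5. caller retries the faulting access
```

In a real OS step 5 is performed by returning from the exception handler, at which point the CPU re-executes the faulting instruction and the translation now succeeds.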
VM attributes: page size flag (PS)
In PDEs only
Determines the page size
– Clear => page size = 4KB (& PDE points to a page table)
– Set => page size = 4MB (& PDE points to superpage)
VM attributes: accessed (A) & dirty (D)
MMU sets A-flag
– The first time a page (or page table) is accessed (load or store)
MMU sets D-flag
– The first time a page (or PT) is written to (store only)
A & D are sticky
– Once set, MMU (=HW) never clears them
– Only SW does
OS clears them
– When initially loading PTE
– Possibly from time to time as part of LRU approximation (used to decide which pages to swap out and which to keep)
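One well-known way to approximate LRU with the accessed flag is the clock (second-chance) algorithm. The slides do not say which scheme any particular OS uses, so the sketch below is only an illustration; `pick_victim` and the list-of-dicts page representation are assumptions.

```python
def pick_victim(pages, hand=0):
    """Return (victim_index, new_hand) using the clock algorithm.

    pages is a list of dicts with an 'A' (accessed) bit. A page whose A bit
    is set gets a second chance: software clears the bit and moves on.
    """
    n = len(pages)
    while True:
        if pages[hand]["A"]:
            pages[hand]["A"] = 0      # the OS, not the MMU, clears the flag
            hand = (hand + 1) % n
        else:
            return hand, (hand + 1) % n
```

Pages that are touched between sweeps have their A bit set again by the MMU and survive; pages that stay untouched are the ones chosen for eviction.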
VM attributes: global flag (G)
Has effect only when PGE=1 in CR4
When set, indicates page is “global”
– Not flushed from TLB when CR3 loaded
– Ignored for PDEs with PS=0 (that point to page tables)
Used to improve performance
– Keeps important pages of OS in TLB across context switches
Only software can set or clear this flag
Cache attributes: PWT
PWT
– Means “page-level write-through”
Controls write-through / write-back caching policy of page / PT
– 1: enable write-through caching
– 0: disable write-through => enable write-back caching
Ignored if
– CD (“cache disable”) flag is set in CR0
– The associated PCD flag is set
Cache attributes: PCD
PCD
– Means “page-level cache disable” flag
Controls caching of individual pages / PTs
– 1: caching of the associated page/PT is prevented
– 0: caching allowed
Used
– When caching doesn’t help performance (e.g., streaming)
– For memory-mapped I/O ports used to communicate with devices
Assumed as set (regardless of actual value)
– If the CD (“cache disable”) flag in CR0 is set
Cache attributes: PAT
PAT
– Means “page attribute table index” flag
If on, used along with PCD & PWT flags to select an entry in the PAT
– Which in turn selects the memory type for the page
– PAT is a 64bit register
– (Not going into the details)
Protection attributes: R/W & U/S
Read/write (R/W) flag
– Specifies read-write privileges for a page (if PTE) or a group of pages (if PDE)
– 0 = read only
– 1 = read & write
User/supervisor (U/S) flag
– Specifies privileges for a page (PTE) or a group of pages (PDE that points to a page table)
– 0 = supervisor privilege level
– 1 = user privilege level
– User code accessing a supervisor page triggers a page-fault exception, typically resulting in the termination of the program
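The R/W and U/S checks can be sketched as a small predicate. Two simplifications are assumed and not taken from the slides: a user-mode access must be permitted at both the PDE and the PTE level, and supervisor code may access any present page (the behavior when CR0.WP is clear). The function name is hypothetical.

```python
P, RW, US = 1 << 0, 1 << 1, 1 << 2  # present, read/write, user/supervisor


def access_allowed(pde, pte, is_write, is_user):
    """Return True if the access passes the paging protection checks."""
    if not (pde & P and pte & P):
        return False              # not present: page fault regardless of mode
    if not is_user:
        return True               # assumption: supervisor bypasses R/W and U/S
    combined = pde & pte          # a flag counts only if set at both levels
    if not combined & US:
        return False              # user touching a supervisor page
    if is_write and not combined & RW:
        return False              # user writing a read-only page
    return True
```

A `False` result corresponds to the page-fault exception described above.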
Misc issues
Memory aliasing/sharing
– When two (or more) PDEs point to a common page table
– When two (or more) PTEs point to a common page
– But SW must maintain consistency of the accessed & dirty bits in these PDEs & PTEs
Base address of page-directory
– Physical address of the current page directory is stored in CR3, also called the page-directory-base-register (PDBR)
– PDBR typically reloaded upon task switches
– Page directory must remain in-memory as long as task is active
32BIT X86 EXTENDED PAGING
PAE – Physical Address Extension
32bit address imposes a limit
– Means we can use memory <= 2^32 = 4GB
– Too small for many systems
PAE (physical address extension) support
– Allows access to 2^36 bytes of RAM (= 64 GB)
– But not directly (addresses remain 32 bits)
Only applicable when paging is enabled
– And when PAE is also turned on in CR4
– Supports 4KB and 2MB (rather than 4MB) pages
PAE – Physical Address Extension
Relies on an additional Page Directory Pointer Table
– Lies above the page directory in the translation hierarchy
– Has 4 entries of 64-bits each to support up to 4 page directories
– PDEs and PTEs are increased to 64 bits to accommodate 36-bit base physical addresses
– Each 4KB page directory and page table can thus have up to 512 entries
– CR3 contains the page-directory-pointer-table base address
4KB Page Mapping with PAE
Linear address divided to
– Page-directory-pointer-table entry
  Indexed by bits 31:30 of the linear address
  Provides an offset to one of 4 entries in the page-directory-pointer table
  The selected entry provides the base physical address of a page directory
– Dir (9 bits) – points to a PDE in the Page Directory
  PS in the PDE = 0
  PDE provides a 24 bit, 4KB aligned base physical address of a page table (36 - 12 = 24)
– Table (9 bits) – points to a PTE in the Page Table
  PTE provides a 24 bit, 4KB aligned base physical address of a 4KB page
– Offset (12 bits) – offset within the selected 4KB page
[Figure: 32-bit linear address split into Dir ptr (bits 31:30), DIR (bits 29:21), TABLE (bits 20:12), and OFFSET (bits 11:0). CR3 (PDPTR, 32 bits, 32B aligned) -> 4 entry page-directory-pointer table -> 512 entry page directory -> 512 entry page table -> 4KByte page.]
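The PAE walk for 4KB pages differs from the regular one in the index widths (2/9/9/12) and in the 8-byte entries. The sketch below assumes a 36-bit physical address space, skips the present-flag checks for brevity, and again uses a hypothetical `mem` dict as physical memory.

```python
ENTRY_SIZE = 8                          # PAE entries are 64 bits wide
BASE_MASK = ((1 << 36) - 1) & ~0xFFF    # bits 35:12: 24-bit, 4KB-aligned base


def split_pae_4kb(linear):
    """Split a 32-bit linear address into Dir ptr (2), DIR (9), TABLE (9), OFFSET (12)."""
    return ((linear >> 30) & 0x3, (linear >> 21) & 0x1FF,
            (linear >> 12) & 0x1FF, linear & 0xFFF)


def walk_pae_4kb(cr3, linear, mem):
    """Translate via PDPT -> page directory -> page table (no fault checks)."""
    ptr, d, t, offset = split_pae_4kb(linear)
    pdpte = mem[(cr3 & ~0x1F) + ENTRY_SIZE * ptr]    # PDPT base is 32B aligned
    pde = mem[(pdpte & BASE_MASK) + ENTRY_SIZE * d]
    pte = mem[(pde & BASE_MASK) + ENTRY_SIZE * t]
    return (pte & BASE_MASK) | offset                # may exceed 32 bits
```

Note that the result can lie above 4GB even though the linear address is only 32 bits wide, which is the purpose of PAE.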
2MB Page Mapping with PAE
Linear address divided to
– Page-directory-pointer-table entry
  Indexed by bits 31:30 of the linear address
  Provides an offset to one of 4 entries in the page-directory-pointer table
  The selected entry provides the base physical address of a page directory
– Dir (9 bits) – points to a PDE in the Page Directory
  PS in the PDE = 1
  PDE provides a 15 bit, 2MB aligned base physical address of a 2MB page
– Offset (21 bits) – offset within the selected 2MB page
[Figure: 32-bit linear address split into Dir ptr (bits 31:30), DIR (bits 29:21), and OFFSET (bits 20:0). CR3 (PDPTR, 32 bits, 32B aligned) -> page-directory-pointer table -> page directory -> 2MByte page.]
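For a 2MB page under PAE, the final step combines the 15-bit page base from the PDE with the 21-bit offset. A brief sketch, assuming a 36-bit physical address space and PS at bit 7; the helper name is made up for illustration.

```python
PS = 1 << 7                                   # page size flag in the PDE
BASE_2MB_MASK = ((1 << 36) - 1) & ~((1 << 21) - 1)  # bits 35:21, 15 bits


def translate_pae_2mb(pde, linear):
    """Form the physical address from a PS=1 PAE PDE and a linear address."""
    if not pde & PS:
        raise ValueError("PDE points to a page table, not a 2MB page")
    return (pde & BASE_2MB_MASK) | (linear & 0x1FFFFF)
```

The 15 + 21 = 36 bits of the result match the 36-bit physical address that PAE provides.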
PTE/PDE/PDP Entry Format with PAE
The major differences in these entries are as follows:
– A page-directory-pointer-table entry is added
– The size of the entries is increased from 32 bits to 64 bits
– The maximum number of entries in a page directory or page table is 512
– The base physical address field in each entry is extended to 24 bits
Paging in 64 bit Mode
PAE paging structures expanded
– Potentially support mapping a 64-bit linear address to a 52-bit physical address
– First implementation supports mapping a 48-bit linear address into a 40-bit physical address
A 4th page mapping table added: the page map level 4 table (PML4)
– The base physical address of the PML4 is stored in CR3
– A PML4 entry contains the base physical address of a page directory pointer table
The page directory pointer table is expanded to 512 8-byte entries
– Indexed by 9 bits of the linear address
The size of the PDE/PTE tables remains 512 eight-byte entries
– each indexed by nine linear-address bits
The total number of translated linear-address bits becomes 48 (4 × 9 index bits + 12 offset bits)
PS flag in PDEs selects between 4-KByte and 2-MByte page sizes
– CR4.PSE bit is ignored
4KB Page Mapping in 64 bit Mode
[Figure: 64-bit linear address split into sign extension (bits 63:48), PML4 (bits 47:39), PDP (bits 38:30), DIR (bits 29:21), TABLE (bits 20:12), and OFFSET (bits 11:0); each index is 9 bits and the offset is 12 bits. CR3 (40 bits, 4KB aligned) -> 512 entry PML4 table -> 512 entry page-directory-pointer table -> 512 entry page directory -> 512 entry page table -> 4KByte page.]
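The 64-bit split can be sketched the same way as the 32-bit ones. The code assumes the 48-bit implementation described in these slides, in which bits 63:48 must be a sign extension of bit 47; both helper names are hypothetical.

```python
def is_canonical(va):
    """True if bits 63:47 are all zeros or all ones (48-bit implementation)."""
    top = (va >> 47) & 0x1FFFF   # 17 bits: bit 47 plus its sign extension
    return top == 0 or top == 0x1FFFF


def split_64_4kb(va):
    """Return (PML4, PDP, DIR, TABLE, OFFSET) for a 4KB-page mapping."""
    indices = tuple((va >> shift) & 0x1FF for shift in (39, 30, 21, 12))
    return indices + (va & 0xFFF,)
```

Four 9-bit indices plus the 12-bit offset account for all 48 translated bits.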
2MB Page Mapping in 64 bit Mode
[Figure: 64-bit linear address split into sign extension (bits 63:48), PML4 (bits 47:39), PDP (bits 38:30), DIR (bits 29:21), and OFFSET (bits 20:0); each index is 9 bits and the offset is 21 bits. CR3 (40 bits, 4KB aligned) -> 512 entry PML4 table -> 512 entry page-directory-pointer table -> 512 entry page directory -> 2MByte page.]
PTE/PDE/PDP/PML4 Entry Format – 4KB Pages
TLBs
The processor saves most recently used PDEs and PTEs in TLBs
– Separate TLBs for data and for instructions
– Separate TLBs for 4-KByte and 2/4-MByte page sizes
OS running at privilege level 0 can invalidate TLB entries
– INVLPG instruction invalidates a specific PTE in the TLB
  This instruction ignores the setting of the G flag
– Whenever a PDE/PTE is changed (including when the present flag is set to zero), OS must invalidate the corresponding TLB entry
– All (non-global) TLB entries are automatically invalidated when CR3 is loaded
The global (G) flag prevents frequently used pages from being automatically invalidated on a task switch
– The entry remains in the TLB indefinitely
– Only INVLPG can invalidate a global page entry
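The invalidation rules above can be modeled with a toy TLB. The sketch assumes CR4.PGE=1 (so the G flag is honored) and ignores capacity and the data/instruction split; the class and method names merely mirror the instruction and register names and are not a real API.

```python
class ToyTLB:
    """Toy TLB: vpn -> (pfn, is_global). Models invalidation rules only."""

    def __init__(self):
        self.entries = {}

    def insert(self, vpn, pfn, is_global=False):
        self.entries[vpn] = (pfn, is_global)

    def load_cr3(self):
        """Loading CR3 drops every non-global entry (e.g. on a task switch)."""
        self.entries = {vpn: e for vpn, e in self.entries.items() if e[1]}

    def invlpg(self, vpn):
        """INVLPG drops the entry for one page, ignoring its G flag."""
        self.entries.pop(vpn, None)
```

After a context switch only the global entries, typically kernel pages, remain, so the new task does not have to re-fetch them.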