SpecTLB: A Mechanism for Speculative Address Translation
2012. 06. 13, Miseon Han
Thomas W. Barr, Alan L. Cox, Scott Rixner
Rice Computer Architecture Group, Rice University
ISCA, June 2011
Motivation
http://compiler.korea.ac.kr
Virtual Memory: Still an increasing challenge
• Virtual memory
  – Performance overhead 5-14% for ‘typical’ applications [Bhargava08]
  – 89% under virtualization [Bhargava08]
  – Large pages not always a good solution
Physical memory allocator – Large pages
• What page size to pick?
  – 4KB, 2MB, 1GB on x86
• Can’t always use largest size
  – Wasted memory
  – Increased I/O traffic
• Dynamic page size selection
Ideas
• SpecTLB (Speculative TLB)
  – A hardware/software system
• Reservation-based physical memory allocator [Talluri94]
  – Allocate small pages by default to maintain fine-grained control
• Predict small page translations in hardware
  – Performance of large pages, control of small pages
Background
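x86-64 Page Table format
• Four-level radix-tree page table
• Example virtual address: 0x5c8315cc2016
  – Bit fields: [47:39] [38:30] [29:21] [20:12] [11:0]
  – Indices and offset: {0b9, 00c, 0ae, 0c2, 016}
  – Resulting physical address (frame, offset): {123, 016}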
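The index split in the example above can be reproduced with a few shifts and masks. The following is a minimal sketch; the function and constant names are illustrative and do not come from the slides.

```python
# Sketch: split a 48-bit x86-64 virtual address into its four page-table
# indices and the page offset. Names here are illustrative assumptions.

INDEX_BITS = 9      # each level indexes a 512-entry table
OFFSET_BITS = 12    # offset within a 4KB page
LEVELS = 4          # four-level radix tree


def split_virtual_address(va):
    """Return (L4 index, L3 index, L2 index, L1 index, page offset)."""
    offset = va & ((1 << OFFSET_BITS) - 1)
    indices = []
    for level in range(LEVELS - 1, -1, -1):      # top level first
        shift = OFFSET_BITS + level * INDEX_BITS
        indices.append((va >> shift) & ((1 << INDEX_BITS) - 1))
    return (*indices, offset)


fields = split_virtual_address(0x5c8315cc2016)
print([f"{f:03x}" for f in fields])   # ['0b9', '00c', '0ae', '0c2', '016']
```

Each index selects one entry in the table at its level; the final 12 bits pass through translation unchanged.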
Large pages
• Page table levels describe physical address space at different granularity
  – Coverage of one entry, from the top level down: 512GB, 1GB, 2MB, 4KB
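The four granularities follow from the 4KB base page and the 512 entries per table: each level up multiplies coverage by 512. A short sketch of that arithmetic, with illustrative helper names:

```python
# Sketch: bytes covered by a single entry at each page-table level.
# Helper names are illustrative assumptions, not from the slides.

PAGE_SIZE = 4 * 1024        # 4KB base page
ENTRIES_PER_TABLE = 512     # 9 index bits per level


def entry_coverage(level):
    """Bytes mapped by one entry at `level` (1 = leaf level, 4 = root)."""
    return PAGE_SIZE * ENTRIES_PER_TABLE ** (level - 1)


def human(n):
    """Format a power-of-two byte count with a binary unit suffix."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n < 1024:
            return f"{n}{unit}"
        n //= 1024
    return f"{n}PB"


print([human(entry_coverage(level)) for level in (4, 3, 2, 1)])
# ['512GB', '1GB', '2MB', '4KB']
```

On x86-64, entries at the 2MB and 1GB levels can map a large page directly instead of pointing to a lower-level table.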
Reservation-based memory allocation
• Reservation-based memory allocation [Talluri94]
  – Always allocate small pages at first, recording them in a book-keeping entry
  – Place these small pages in a large page ‘reservation’
    • if the handler decides that a reservation is needed
  – Promote reservation to large page
    • when all small pages in the reservation are allocated
  – Extended and implemented in FreeBSD [Navarro02]
    • Default memory allocator
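The reservation scheme can be sketched as a toy allocator. This is an illustration under simplifying assumptions, not the FreeBSD implementation: it always creates a reservation on the first fault in a 2MB region (the slides leave that decision to the handler), never breaks a reservation under memory pressure, and uses hypothetical class and method names.

```python
# Toy sketch of reservation-based allocation. All names are hypothetical;
# this only illustrates the idea, it is not the FreeBSD allocator.

SMALL_PAGES_PER_RESERVATION = 512   # 2MB region / 4KB pages


class Reservation:
    def __init__(self, phys_base_frame):
        self.phys_base_frame = phys_base_frame  # first 4KB frame of the 2MB region
        self.allocated = set()                  # slots handed out so far
        self.promoted = False

    def allocate(self, slot):
        """Back small page `slot` with its fixed frame inside the region."""
        if not 0 <= slot < SMALL_PAGES_PER_RESERVATION:
            raise ValueError("slot out of range")
        self.allocated.add(slot)
        # Promote once every small page in the reservation is allocated.
        if len(self.allocated) == SMALL_PAGES_PER_RESERVATION:
            self.promoted = True
        return self.phys_base_frame + slot


class ReservationAllocator:
    def __init__(self, first_free_frame=0x8000):
        self.next_base = first_free_frame   # assumed 512-frame aligned
        self.reservations = {}              # virtual 2MB region -> Reservation

    def handle_fault(self, vpn):
        """Return the physical frame for virtual page number `vpn`."""
        region, slot = divmod(vpn, SMALL_PAGES_PER_RESERVATION)
        res = self.reservations.get(region)
        if res is None:
            # Assumption: always reserve a full 2MB region on first touch.
            res = Reservation(self.next_base)
            self.next_base += SMALL_PAGES_PER_RESERVATION
            self.reservations[region] = res
        return res.allocate(slot)


alloc = ReservationAllocator()
print(hex(alloc.handle_fault(2)))   # 0x8002
```

Because every small page keeps the same offset inside the physical region as inside the virtual one, promotion needs no copying, and the physical address of an untouched page in the reservation is predictable.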
Reservation-based memory allocation (cont.)
• Handler reserves a 2MB region of physical space.
• Reservation is ‘promoted’ into a large page.
• Reservations may not be filled.
SpecTLB
SpecTLB
• TLB-like structure
  – Tracks reservations, not actual mappings
  – Detect reservations
  – Predict translations
  – Verify predictions
Detecting reservations
                       Virtual Address              Physical Address
Access                 {0b9, 00c, 0ae, 002, 313}    {8002, 313}
Current reservation    {0b9, 00c, 0ae, 000, 000}    {8000, 000}
Predicting translations
                       Virtual Address              Physical Address
Access                 {0b9, 00c, 0ae, 005, 313}    {8005, 313}?
Current reservation    {0b9, 00c, 0ae, 000, 000}    {8000, 000}
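The detect and predict steps can be sketched with the example addresses from these two slides. The sketch assumes 2MB reservations and models the SpecTLB as a plain dictionary; how the hardware recognizes that a walked page lies in a reservation is not shown in the slides, so `record_reservation` is simply called by hand. All names are illustrative.

```python
# Sketch of SpecTLB detection and prediction, assuming 2MB reservations.
# Class, method, and helper names are illustrative assumptions.

LARGE_PAGE_BITS = 21
LARGE_PAGE_MASK = (1 << LARGE_PAGE_BITS) - 1


def make_va(l4, l3, l2, l1, offset):
    """Build a virtual address from four 9-bit indices and a 12-bit offset."""
    return (l4 << 39) | (l3 << 30) | (l2 << 21) | (l1 << 12) | offset


class SpecTLB:
    def __init__(self):
        self.entries = {}   # virtual 2MB base -> physical 2MB base

    def record_reservation(self, va, pa):
        """Track the reservation containing a walked translation va -> pa."""
        self.entries[va & ~LARGE_PAGE_MASK] = pa & ~LARGE_PAGE_MASK

    def predict(self, va):
        """Predicted physical address, or None if no reservation is tracked."""
        base = self.entries.get(va & ~LARGE_PAGE_MASK)
        if base is None:
            return None
        return base | (va & LARGE_PAGE_MASK)


spec = SpecTLB()
# Detect: the walk for {0b9, 00c, 0ae, 002, 313} returned {8002, 313}.
spec.record_reservation(make_va(0x0b9, 0x00c, 0x0ae, 0x002, 0x313), 0x8002313)
# Predict: a later miss on {0b9, 00c, 0ae, 005, 313} in the same reservation.
print(hex(spec.predict(make_va(0x0b9, 0x00c, 0x0ae, 0x005, 0x313))))   # 0x8005313
```

The predicted value 0x8005313 corresponds to {8005, 313} on the slide. It remains a guess until the page walk confirms it, because the small page at that position may not have been allocated yet.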
SpecTLB
• Provides predicted translations for pages within tracked reservations
• Predictions may be incorrect
  – Page table must still be walked
    • Page walk can occur in parallel
    • Latency hidden
  – Speculative translation can be used concurrently
    • Microarchitecture cancels speculative work on a misprediction
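The verification step amounts to comparing the prediction against the result of the page walk. The sketch below runs sequentially, whereas in hardware the walk overlaps with speculative execution; the function name and the "commit"/"squash" labels are illustrative, not terms from the slides.

```python
# Sketch of prediction verification on a TLB miss. Sequential Python stands
# in for what hardware does in parallel; all names are assumptions.

def resolve_tlb_miss(va, predicted_pa, page_walk):
    """Return (authoritative physical address, outcome).

    page_walk is a callable va -> physical address (or None if unmapped).
    """
    walked_pa = page_walk(va)        # overlaps with speculative work in hardware
    if predicted_pa is None:
        return walked_pa, "no-prediction"   # nothing was speculated
    if walked_pa == predicted_pa:
        return walked_pa, "commit"          # speculative work is kept
    return walked_pa, "squash"              # speculative work is cancelled


# A dictionary stands in for the page table in this example.
page_table = {0x1000: 0x8001000, 0x2000: 0x9002000}
print(resolve_tlb_miss(0x1000, 0x8001000, page_table.get))   # (134221824, 'commit')
print(resolve_tlb_miss(0x2000, 0x8002000, page_table.get)[1])   # squash
```

A correct prediction hides the walk latency behind useful work; an incorrect one costs no more than the walk itself plus the discarded speculation.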
Simulation & Result
SpecTLB Results
Full system simulator, unmodified FreeBSD kernel

Benchmark    TLB miss rate          Speculative prediction   Prediction   DRAM accesses
             (/1k DRAM accesses)    frequency                accuracy     overlapped
PostgreSQL   74.43                  0.762                    0.989        0.448
python       15.36                  0.760                    0.998        0.419
SPECjbb      20.04                  0.418                    0.971        0.310
bzip2         4.00                  0.293                    0.998        0.235
gcc           4.25                  0.852                    0.988        0.664
mcf          79.43                  0.992                    1.000        0.956
dc.B         42.29                  0.083                    0.353        0.073
ep.C         12.94                  0.014                    0.962        0.023
TLB Prefetcher Comparison
• SpecTLB and TLB prefetching both hide the latency of TLB misses
  – SpecTLB: uses large-page reservations to cover the current TLB miss
  – TLB prefetcher: uses access patterns to cover future TLB misses
• Speculative work
  – SpecTLB: instructions execute in parallel with confirmation of the translation
  – TLB prefetcher: prefetches page table entries
TLB Prefetcher Comparison (cont.)
• TLB prefetcher generally hides fewer walks than SpecTLB
  – Prefetcher does well with high access regularity

Benchmark    TLB miss rate   SpecTLB   TLB Prefetcher
PostgreSQL   74.43           0.989     0.106
python       15.36           0.998     0.633
SPECjbb      20.04           0.971     0.151
bzip2         4.00           0.998     0.978
gcc           4.25           0.988     0.330
mcf          79.43           1.000     0.051
dc.B         42.29           0.353     0.190
ep.C         12.94           0.962     0.897
Conclusions
• SpecTLB hides latency of TLB misses
  – Predictions allow page walk to occur in parallel with speculative work
  – >62% of TLB miss latencies hidden for majority of benchmarks