18
Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 Biswa@CSE-IITK

Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

Lecture-16 (Cache Replacement Policies)CS422-Spring 2018

Biswa@CSE-IITK

Page 2: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 2

From SPEC92 – Miss rate: Still Applicable Today

Cache Size (KB)

Mis

s R

ate

pe

r T

yp

e

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

1 2 4 8

16

32

64

12

8

1-way

2-way

4-way

8-way

Capacity

Compulsory

Page 3: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 3

2:1 Cache Rule (Why? Piazza +1, Before Next Lecture)

Cache Size (KB)

Mis

s R

ate

pe

r T

yp

e

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

1 2 4 8

16

32

64

12

8

1-way

2-way

4-way

8-way

Capacity

Compulsory

miss rate 1-way associative cache size X = miss rate 2-way associative cache size X/2

Page 4: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 4

Cache Replacement-101

Think of each block in a set having a “priority”

Indicating how important it is to keep the block in the cache

Key issue: How do you determine/adjust block priorities?

There are three key decisions in a set:

Insertion, promotion, eviction (replacement)

Insertion: What happens to priorities on a cache fill? Where to insert the incoming block, whether or not to insert the block

Promotion: What happens to priorities on a cache hit? Whether and how to change block priority

Eviction/replacement: What happens to priorities on a cache miss? Which block to evict and how to adjust priorities

Page 5: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 5

Eviction (Replacement) Policy?

Which block in the set to replace on a cache miss?

Any invalid block first

If all are valid, consult the replacement policy

Random

FIFO

Least recently used (how to implement?)

Not most recently used

Least frequently used?

Least costly to re-fetch?

Why would memory accesses have different cost?

Hybrid replacement policies

Optimal replacement policy?

Page 6: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 6

Belady

• Belady’s OPT• Replace the block that is going to be referenced furthest in the

future by the program• Belady, “A study of replacement algorithms for a virtual-

storage computer,” IBM Systems Journal, 1966.• How do we implement this? Simulate?

• Is this optimal for minimizing miss rate?

• Is this optimal for minimizing execution time?• No. Cache miss latency/cost varies from block to block!• Two reasons: Remote vs. local caches and miss overlapping• Qureshi et al. “A Case for MLP-Aware Cache

Replacement,“ ISCA 2006.

Page 7: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 7

Our Goal:

To minimize off-chip DRAM accesses

Page 8: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 8

LRU - 101

Cache Eviction Policy: On a miss (block i), which block to evict (replace) ?

Cache Insertion Policy: New block i inserted into MRU.

Cache Promotion Policy: On a future hit (block i), promote to MRU

a b c d e f g h

MRU LRUSET A

i a b c d e f g

MRU LRUSET A

LRU causes thrashing when working set > cache size

Page 9: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 9

Access Patterns

Recency friendly (a1, a2,….ak, ak-1, ….a2, a1)N

Thrashing (a1, a2,….ak)N [k > cache

size]

Streaming (a1, a2,….a∞)N

Combination of above three

Page 10: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 10

Types of Workloads – 4MB cache

Page 11: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 11

Limitations of LRU

LRU exploits temporal locality

Streaming data (a1, a2, a3,….a∞): No temporal locality, No temporal reuse

Thrashing data (a1, a2, a3,….an) [n>c] Temporal locality exists. However, LRU fails to capture.

Page 12: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 12

Bimodal Insertion Policy

if ( rand() < e ) e=1/16,1/32,1/64Insert at MRU position;

elseInsert at LRU position;

For small e: BIP retains thrashing protection of LRU insertion policy.

Infrequently insert lines in MRU position

Page 13: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 13

Dynamic Insertion Policy – Multicore:+2 in Piazza

SDM (LRU-sets)

Follower Sets

SDM (BIP-sets)

n-bit PSEL+

miss

–miss

MSB = 0?

YES No

Use LRU Use BIP

SDM – Set Dueling monitors PSEL – n-bit saturating counters for deciding a policy

Page 14: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 14

Still Miles to Go

References to non-temporal data (scans) discards frequently referenced working set

LLCsize

Working set larger than the cache causes thrashing

miss miss miss missmiss

hit hit hit miss hit hitmiss missscan scan scan

Wsize

Page 15: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 15

Still Miles to Go

hit hit hit hit hit mis

s

mis

s

mis

s

mis

s

mis

s

Working set larger than the cache Preserve some of working set in the cache

Recurring scans (bursts of non-temporal data) Preserve frequently referenced working set in the cache

hit hit hit hit hitscan scan scanhit hit hit

Wsize

LLCsize

Page 16: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 16

RRIP – ISCA ‘10

h g f e d c b a

MRU position LRU positionRRP head

0 1 2 3 4 5 6 7LRU chain

position stored with each cache

block

RRP tail

RRP: Re-reference prediction

Page 17: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 17

RRIP

RRP Head RRP Tail

Re-reference Prediction ValueRRPV (n=2): 3

‘distant’

2

‘far’

1

‘intermediate’

0

‘near-immediate’

Qualitative Prediction:

h g f e d c b a

Intuition: New cache block will not be re-referenced soon. Replaces block with distant RRPV.

Insert with RRPV=2, Evict with RRPV=3promote blocks with RRPV=0.

Static RRIP (Single core) and Thread-Aware Dynamic RRIP (SRRIP+BRRIP, multi-core, based on SDMs).

Page 18: Lecture-16 (Cache Replacement Policies) CS422-Spring 2018 · Eviction (Replacement) Policy? Which block in the set to replace on a cache miss? Any invalid block first If all are valid,

CS422: Spring 2018 Biswabandan Panda, CSE@IITK 18

DRRIP

0Imme-diate

1Inter-

mediate

2far

3distant

re-reference

No Victim

insertion

re-reference

evictionre-reference

No Victim

NoVictim

(SRRIP)Scan-Resistant

insertion

( BRRIP )Thrash-Resistant