Caching Strategies for High Performance...

SPE: Cache


11/7/2018


Type | Cache latency | Backing store latency
Processor memory | CPU clock speed |
Virtual memory | CPU clock speed | 1-20 ms
Disk controller | 500 µsecs | 1-20 ms
Relational Database | CPU clock speed | 1-20 ms
Web browser | Direct file system access | Network access



Type | Request | Chunk size
Processor memory | Address | Cache line (e.g., 256 bytes)
Virtual memory | Address | Page (e.g., 4 KB, 1 MB)
Disk controller | Record | Track
Relational Database | Block | Block
Web browser | File | File



[Figure: Cumulative Distribution of Hits by Stack Depth. x-axis: Cache size; y-axis: 0% to 100%.]
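
A curve like the one in this figure can be derived from a reference trace by recording, for each re-reference, the LRU stack depth at which the item was found. The following is a minimal sketch under that assumption; the function name `hit_cdf_by_stack_depth` and the sample trace are hypothetical and do not come from the slides.

```python
from collections import Counter


def hit_cdf_by_stack_depth(trace):
    """Map each LRU stack depth k to the fraction of references found at depth <= k.

    Under an LRU stack model, that fraction is the hit ratio an LRU cache
    holding k items would have achieved on this trace.
    """
    stack = []                 # index 0 is the most recently used item
    depth_hits = Counter()     # depth -> number of references found there
    for item in trace:
        if item in stack:
            depth = stack.index(item) + 1   # 1-based stack depth
            depth_hits[depth] += 1
            stack.remove(item)
        stack.insert(0, item)               # item becomes most recently used

    total = len(trace)
    cdf = {}
    running = 0
    for depth in range(1, len(stack) + 1):
        running += depth_hits[depth]
        cdf[depth] = running / total
    return cdf


if __name__ == "__main__":
    # Hypothetical six-reference trace over three distinct items.
    for depth, fraction in hit_cdf_by_stack_depth(list("abacba")).items():
        print(depth, fraction)
```

Reading the result at a given depth gives the hit ratio for a cache of that size, which is why the cumulative curve flattens once the cache is large enough to hold the working set.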





[Figure: 0% to 100% curve plotted against Cache size, with regions labeled "Too small", "Just right", and "Too big".]



$$\bar{w} = \text{hit\%} \times \bar{w}_{\text{hits}} + \left( (1 - \text{hit\%}) \times \bar{w}_{\text{misses}} \right)$$
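
As a worked illustration of this weighted average, the sketch below computes the mean service time from a hit ratio and the mean times for hits and misses. The function name and the sample numbers (90% hits, 1 ms per hit, 20 ms per miss) are assumptions chosen for illustration, not figures from the slides.

```python
def mean_service_time(hit_ratio, mean_hit_time, mean_miss_time):
    """Weighted average of hit and miss service times.

    hit_ratio is a fraction between 0 and 1; both times use the same unit.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * mean_hit_time + (1.0 - hit_ratio) * mean_miss_time


if __name__ == "__main__":
    # Hypothetical values: 90% hits at 0.001 s, misses at 0.020 s.
    print(mean_service_time(0.90, 0.001, 0.020))  # roughly 0.0029 s
```

Even at a 90% hit ratio, the misses in this example contribute about two thirds of the mean, which is consistent with the mean sitting between two distinct service-time modes.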

[Figure: Bi-modal service time distribution. x-axis ticks: 0, 0.01, 0.02, 0.03; series: Cache misses, Cache hits.]



real memory : virtual memory


[Figure: Cache size, associativity and Replacement Policy. x-axis: Cache size (KB), 0 to 256; y-axis: Miss %, 0 to 6; series: 2-way LRU, 2-way Random, 4-way LRU, 4-way Random, 8-way LRU, 8-way Random.]

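The comparison of associativity and replacement policy can be explored with a small simulation. This is a rough sketch, not the method used to produce the chart: the function `miss_ratio`, its parameters, the 64-byte line size, and the synthetic trace are all assumptions made for illustration.

```python
import random
from collections import OrderedDict


def miss_ratio(trace, num_sets, ways, policy="lru", line_size=64, seed=0):
    """Simulate a set-associative cache and return the fraction of misses.

    trace is a sequence of byte addresses; policy is "lru" or "random".
    """
    rng = random.Random(seed)
    # One ordered map per set: keys are tags, oldest (least recent) first.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for address in trace:
        line = address // line_size
        index, tag = line % num_sets, line // num_sets
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)            # hit: mark most recently used
            continue
        misses += 1
        if len(lines) >= ways:                # set is full: pick a victim
            if policy == "lru":
                lines.popitem(last=False)     # evict least recently used
            else:
                del lines[rng.choice(list(lines))]
        lines[tag] = None
    return misses / len(trace)


if __name__ == "__main__":
    # Hypothetical trace: three cache lines that cycle through a single set.
    trace = [0, 64, 128] * 10
    for ways in (2, 4):
        for policy in ("lru", "random"):
            print(ways, policy, miss_ratio(trace, 1, ways, policy))
```

With this deliberately adversarial trace, 2-way LRU misses on every access because the cycle is one line longer than the set, while 4 ways absorb it entirely. Real workloads behave differently, so the numbers only show how the three parameters interact.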


State | Name | Status | Description
M | Modified | Dirty | Data is stored in this cache
E | Exclusive | Clean | Data is stored in one cache
S | Shared | Clean | Data is stored in more than one cache
I | Invalid | | Data in the cache is invalid; upon access, the contents must be refreshed from the Primary store
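
The four states can be read as a small state machine for one cache line in one processor's cache. The sketch below follows the textbook MESI transitions in simplified form, ignoring bus messages and write-back timing; the event names and the `others_have_copy` flag are assumptions introduced here, not terms from the slides.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def next_state(state, event, others_have_copy=False):
    """Return the next MESI state of a cache line in this cache.

    event is one of "local_read", "local_write", "remote_read",
    "remote_write". others_have_copy only matters when an INVALID
    line is read locally.
    """
    if event == "local_read":
        if state is State.INVALID:
            # Refreshed from the primary store or from another cache.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state
    if event == "local_write":
        # Writing makes this the only valid, dirty copy.
        return State.MODIFIED
    if event == "remote_read":
        if state in (State.MODIFIED, State.EXCLUSIVE):
            # Another cache now holds the line too (dirty data is written back).
            return State.SHARED
        return state
    if event == "remote_write":
        # Another processor wrote the line, so this copy is stale.
        return State.INVALID
    raise ValueError(f"unknown event: {event}")


if __name__ == "__main__":
    s = State.INVALID
    for ev in ("local_read", "local_write", "remote_read", "remote_write"):
        s = next_state(s, ev)
        print(ev, "->", s.name)
```

The "remote_write" case is the one that matters for contended locks: every write by another processor invalidates the local copy and forces a refresh on the next access.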


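spin_lock:
    mov  eax, 1            ; value that marks the lock as held
    xchg eax, [lockword]   ; atomically swap it with the lock word
    test eax, eax          ; was the previous value zero (lock free)?
    jnz  spin_lock         ; no: the lock was already held, so retry
    ret                    ; yes: lock acquired

spin_unlock:
    xor  eax, eax          ; eax = 0
    xchg eax, [lockword]   ; atomically store 0 to release the lock
    ret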


[Diagrams: cache-line state of lockword during spin_lock and spin_unlock. Labels shown: SHARED, EXCLUSIVE, INVALID, xchg.]

See https://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads


https://software.intel.com/en-us/intel-vtune-amplifier-xe/


[Diagrams: memory allocated to Guest Machines A, B, C, and D, plus Free space. In later frames Guest Machine E is added and the Free space label no longer appears.]


State | Value | Free Memory Threshold | Reclamation Action
High | 0 | 6% | None
Soft | 1 | < 6% | Ballooning
Hard | 2 | < 4% | Swapping to Disk or Pages compressed
Low | 3 | | target allocations

https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/perf-vsphere-memory_management.pdf


(V/R)

$$\frac{\text{guest machine Committed Bytes} \times 100}{\text{current machine memory allocation}}$$
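
The ratio above is straightforward to compute once both quantities are in the same unit. The sketch below assumes both are given in bytes; the function name and the sample sizes (12 GiB committed against an 8 GiB allocation) are hypothetical.

```python
def committed_pct_of_allocation(guest_committed_bytes, machine_allocation_bytes):
    """Guest Committed Bytes as a percentage of current machine memory allocation."""
    if machine_allocation_bytes <= 0:
        raise ValueError("machine_allocation_bytes must be positive")
    return guest_committed_bytes * 100 / machine_allocation_bytes


if __name__ == "__main__":
    GIB = 2 ** 30
    # Hypothetical guest: 12 GiB committed, 8 GiB of machine memory allocated.
    print(committed_pct_of_allocation(12 * GIB, 8 * GIB))  # 150.0
```

A result above 100 means the guest has committed more virtual memory than the machine memory currently backing it.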

https://www.vmware.com/pdf/usenix_resource_mgmt.pdf
http://performancebydesign.blogspot.com/2013/07/virtual-memory-management-in-vmware.html
