Caching Strategies for High Performance...


  • SPE: Cache


    11/7/2018


    Type                 Cache latency              Backing store latency
    Processor memory     CPU clock speed
    Virtual memory       CPU clock speed            1-20 ms
    Disk controller      500 µsecs                  1-20 ms
    Relational Database  CPU clock speed            1-20 ms
    Web browser          Direct file system access  Network access

    Type                 Request  Chunk size
    Processor memory     Address  Cache line (e.g., 256 bytes)
    Virtual memory       Address  Page (e.g., 4 KB, 1 MB)
    Disk controller      Record   Track
    Relational Database  Block    Block
    Web browser          File     File


    [Figure: Cumulative Distribution of Hits by Stack Depth, plotted against cache size (0% to 100%).]
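A curve of this kind can be derived from a reference trace. The sketch below is a minimal illustration, assuming an LRU stack: each reference is scored by the depth at which the item is found in the stack, and a cache of size N hits on every reference whose depth is N or less. The function names and the sample trace are hypothetical and do not come from the slides.

```python
from collections import Counter

def stack_depths(trace):
    """Return the LRU stack depth of each reference (None for a first touch)."""
    stack = []                      # most recently used item is at index 0
    depths = []
    for item in trace:
        if item in stack:
            depths.append(stack.index(item) + 1)   # 1-based depth
            stack.remove(item)
        else:
            depths.append(None)                    # cold miss at any cache size
        stack.insert(0, item)                      # item becomes most recent
    return depths

def cumulative_hit_ratio(trace, max_size):
    """Hit ratio an LRU cache of size 1..max_size would achieve on the trace."""
    counts = Counter(d for d in stack_depths(trace) if d is not None)
    ratios = []
    hits = 0
    for size in range(1, max_size + 1):
        hits += counts.get(size, 0)    # references found at exactly this depth
        ratios.append(hits / len(trace))
    return ratios

# Hypothetical trace of ten references to four distinct items.
trace = list("abcabcdabc")
for size, ratio in enumerate(cumulative_hit_ratio(trace, 4), start=1):
    print(f"cache size {size}: {ratio:.0%} hits")
```

Because one pass over the trace yields the hit ratio for every cache size at once, the same data can be used to judge where adding cache stops paying off.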


    [Figure: Cumulative Distribution of Hits by Stack Depth, plotted against cache size (0% to 100%), annotated "Too small", "Just right" and "Too big".]


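    $$\bar{w} = \text{hit\%} \times \bar{w}_{\text{hits}} + \bigl((1 - \text{hit\%}) \times \bar{w}_{\text{misses}}\bigr)$$

    [Figure: Bi-modal service time distribution, with one mode for cache hits and one for cache misses; axis ticks from 0 to 0.03.]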
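As a worked instance of the weighted-average formula, the following sketch computes the mean service time from a hit ratio and the two mean latencies. The function name and the sample numbers are illustrative assumptions, not measurements from the slides.

```python
def mean_service_time(hit_ratio, hit_time, miss_time):
    """Average service time as a hit-ratio-weighted mix of hit and miss times."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_time + (1.0 - hit_ratio) * miss_time

# Assumed values: 90% hits at 1 ms, misses at 20 ms (times in seconds).
print(mean_service_time(0.90, 0.001, 0.020))
# Raising the hit ratio to 95% roughly halves the contribution of misses.
print(mean_service_time(0.95, 0.001, 0.020))
```

Since the underlying distribution is bi-modal, this mean describes few individual requests: most complete near the hit time and the rest near the miss time.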


    [Figure: Cumulative Distribution of Hits by Stack Depth, plotted against cache size (0% to 100%), labeled "real memory : virtual memory".]


    [Figure: Cache size, associativity and Replacement Policy. x-axis: Cache size (KB), 0 to 256; y-axis: Miss %, 0 to 6. Series: 2-way LRU, 2-way Random, 4-way LRU, 4-way Random, 8-way LRU, 8-way Random.]
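The interaction of cache size, associativity and replacement policy can be explored with a small simulation. The sketch below is a simplified model under stated assumptions (a 64-byte line, addresses mapped to sets by line number modulo the number of sets, a seeded random generator); the function name, parameters and trace are hypothetical and are not the source of the plotted data.

```python
import random

def miss_ratio(trace, num_sets, ways, policy="lru", line_size=64, seed=0):
    """Simulate a set-associative cache and return the fraction of misses."""
    rng = random.Random(seed)               # seeded so runs are repeatable
    sets = [[] for _ in range(num_sets)]    # each list is ordered LRU -> MRU
    misses = 0
    for address in trace:
        line = address // line_size         # memory line containing the address
        lines = sets[line % num_sets]       # the set this line maps to
        if line in lines:
            if policy == "lru":             # a hit refreshes recency under LRU
                lines.remove(line)
                lines.append(line)
            continue
        misses += 1
        if len(lines) >= ways:              # set is full: pick a victim
            victim = 0 if policy == "lru" else rng.randrange(ways)
            lines.pop(victim)
        lines.append(line)
    return misses / len(trace)

# Hypothetical trace: five lines that all map to one 4-way set, visited cyclically.
cyclic = [n * 64 for n in range(5)] * 100
print("LRU   :", miss_ratio(cyclic, num_sets=1, ways=4, policy="lru"))
print("Random:", miss_ratio(cyclic, num_sets=1, ways=4, policy="random"))
```

On this cyclic pattern LRU misses on every reference, because it always evicts the line that is needed next, while random replacement keeps some lines long enough to hit. Real workloads usually behave differently, so the trace should be replaced with one that reflects the workload of interest.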


    State  Name       Data   Description
    M      Modified   Dirty  Data is stored in this cache
    E      Exclusive  Clean  Data is stored in one cache
    S      Shared     Clean  Data is stored in more than one cache
    I      Invalid           Data in the cache is invalid; upon access, the contents must be refreshed from the primary store
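The four states above are the ones usually referred to as the MESI protocol. The sketch below is a simplified, textbook-style model of how one memory line moves between these states as different caches read and write it. It is an assumption-laden illustration (a single line, no bus timing, write-back implied), and the class and method names are hypothetical.

```python
from enum import Enum

class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"

class CoherentLine:
    """Tracks the state of one memory line in each of several caches."""

    def __init__(self, num_caches):
        self.states = [State.INVALID] * num_caches

    def read(self, cache):
        if self.states[cache] is not State.INVALID:
            return                              # read hit: state is unchanged
        holders = [i for i, s in enumerate(self.states)
                   if i != cache and s is not State.INVALID]
        for i in holders:                       # M or E holders drop to Shared
            self.states[i] = State.SHARED
        self.states[cache] = State.SHARED if holders else State.EXCLUSIVE

    def write(self, cache):
        for i in range(len(self.states)):       # every other copy is invalidated
            if i != cache:
                self.states[i] = State.INVALID
        self.states[cache] = State.MODIFIED

    def snapshot(self):
        return "".join(s.value for s in self.states)

line = CoherentLine(num_caches=2)
line.read(0);  print(line.snapshot())   # only cache 0 holds the line
line.read(1);  print(line.snapshot())   # both caches hold a clean copy
line.write(0); print(line.snapshot())   # cache 0 owns it, cache 1 is invalidated
line.read(1);  print(line.snapshot())   # the copy is shared again
```

The write step is the expensive one: every write to a line that another cache holds forces an invalidation and a later refresh.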

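    spin_lock:
        mov  eax, 1           ; value to store: 1 means locked
        xchg eax, [lockword]  ; atomically swap eax with the lock word
        test eax, eax         ; was the previous value 0 (unlocked)?
        jnz  spin_lock        ; no: the lock was held, so retry
        ret                   ; yes: the lock is now ours

    spin_unlock:
        xor  eax, eax         ; eax = 0 means unlocked
        xchg eax, [lockword]  ; atomically store 0 in the lock word
        ret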

    [Diagrams: cache line states around the spin lock. Labels on successive slides: SHARED (spin_lock, xchg, spin_unlock); EXCLUSIVE and INVALID (xchg); SHARED and SHARED.]

    See https://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads


    https://software.intel.com/en-us/intel-vtune-amplifier-xe/

    [Diagrams: memory divided among Guest Machines A, B, C and D plus free space; the later diagrams show Guest Machine E as well and no free space label.]

    State  Value  Free Memory Threshold  Reclamation Action
    High   0      6%                     None
    Soft   1      < 6%                   Ballooning
    Hard   2      < 4%                   Swapping to disk or pages compressed
    Low    3                             target allocations

    https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/perf-vsphere-memory_management.pdf
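The threshold table can be read as a simple classification of the host's free memory percentage. The sketch below encodes only the High, Soft and Hard rows as they appear in the table; the threshold for the Low state is not legible there, so it is left as an optional parameter that the caller must supply. The function name and return format are hypothetical.

```python
def memory_state(free_pct, low_threshold=None):
    """Map free memory (percent) to a (state, reclamation action) pair.

    Thresholds follow the table above. low_threshold is an assumption:
    the table does not show a value for the Low state, so it is only
    applied when the caller provides one.
    """
    if free_pct >= 6.0:
        return ("High", "None")
    if free_pct >= 4.0:
        return ("Soft", "Ballooning")
    if low_threshold is not None and free_pct < low_threshold:
        return ("Low", None)        # action not recoverable from the table
    return ("Hard", "Swapping to disk or page compression")

for pct in (10.0, 5.0, 3.0):
    print(pct, memory_state(pct))
```

The exact thresholds vary between product versions, so the linked paper should be checked before relying on these numbers.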


    V/R:

    $$\frac{\text{guest machine Committed Bytes} \times 100}{\text{current machine memory allocation}}$$
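The ratio above is straightforward to compute once both quantities are available. In the sketch below, the function name is hypothetical and the sample figures are made up for illustration; how the two inputs are collected (guest performance counters, hypervisor statistics) is outside its scope.

```python
def committed_to_allocation_pct(committed_bytes, machine_memory_bytes):
    """Guest Committed Bytes as a percentage of the machine memory currently allocated."""
    if machine_memory_bytes <= 0:
        raise ValueError("machine memory allocation must be positive")
    return committed_bytes * 100 / machine_memory_bytes

GIB = 2 ** 30
# Assumed example: a guest that has committed 6 GiB while holding 4 GiB of machine memory.
print(committed_to_allocation_pct(6 * GIB, 4 * GIB))
```

A value above 100 means the guest has committed more virtual memory than the machine memory currently backing it.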


    https://www.vmware.com/pdf/usenix_resource_mgmt.pdf

    http://performancebydesign.blogspot.com/2013/07/virtual-memory-management-in-vmware.html