Lec 5 Memory Hierarchy and Cache Memory



Memory Hierarchy: Terminology
Hit: data appears in some block in the upper level (example: Block X)
  Hit Rate: the fraction of memory accesses found in the upper level
  Hit Time: time to access the upper level, which consists of RAM access time + time to determine hit/miss
Miss: data needs to be retrieved from a block in the lower level (Block Y)
  Miss Rate = 1 - (Hit Rate)
  Miss Penalty: time to replace a block in the upper level + time to deliver the block to the processor
Hit Time << Miss Penalty

[Figure: the processor reads from and writes to the upper-level memory, which holds Blk X; the upper level exchanges blocks with the lower-level memory, which holds Blk Y.]
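The terms on this slide are commonly combined into an average memory access time estimate: hit time + miss rate x miss penalty. The sketch below is a minimal illustration of that arithmetic; the function name and the example numbers (1 ns hit time, 95% hit rate, 100 ns miss penalty) are assumptions, not values from the lecture.

```python
def average_access_time(hit_time, hit_rate, miss_penalty):
    """Estimate the average time per memory access from the slide's terms."""
    miss_rate = 1 - hit_rate  # Miss Rate = 1 - (Hit Rate)
    # Every access pays the hit time; only misses also pay the penalty.
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers: 1 ns hit time, 95% hit rate, 100 ns miss penalty.
print(average_access_time(1.0, 0.95, 100.0))
```

Because Hit Time << Miss Penalty, even a small drop in hit rate noticeably raises the average.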


General Principles of Memory Locality

Temporal Locality: referenced memory is likely to be referenced again soon (e.g. code within a loop)

Spatial Locality: memory close to referenced memory is likely to be referenced soon (e.g., data in a sequentially accessed array)

Definitions
  Upper: memory closer to processor
  Block: minimum unit that is present or not present
  Block address: location of block in memory
  Hit: data is found in the desired location
  Hit time: time to access upper level
  Miss rate: percentage of time item not found in upper level

Locality + smaller HW is faster = memory hierarchy
  Levels: each smaller, faster, more expensive/byte than level below
  Inclusive: data found in upper level also found in the lower level
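To make the two kinds of locality concrete, the sketch below counts hits in a tiny simulated cache for three access patterns. The direct-mapped layout, the 8-address block size, and the 4-line capacity are all assumptions chosen for illustration; the slides do not specify a cache organization.

```python
def hit_rate(addresses, block_size=8, num_lines=4):
    """Fraction of accesses that hit in a small direct-mapped cache."""
    lines = [None] * num_lines          # each line remembers one block number
    hits = 0
    for address in addresses:
        block = address // block_size   # block address of this access
        index = block % num_lines       # the only line this block may occupy
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block        # miss: load the block, replacing the old one
    return hits / len(addresses)


sequential = list(range(64))            # spatial locality: neighbours share a block
strided = [i * 8 for i in range(64)]    # no locality: every access is a new block
loop = list(range(16)) * 4              # temporal locality: same data reused

for name, pattern in [("sequential", sequential), ("strided", strided), ("loop", loop)]:
    print(name, hit_rate(pattern))
```

The sequential and looping patterns hit most of the time, while the strided pattern never reuses a loaded block.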


What is Cache?
“A computer memory with very short access time used for storage of frequently used instructions or data” – webster.com


Types of Cache
  L1/L2/L3 Cache
  RAM Cache
  Disk Cache
  Software Level Cache


Memory Hierarchy Comparison


L1/L2/L3 Cache (Cache Memory)
  Cache closer to the CPU that stores recently accessed data from RAM
  Holds instructions to be executed next and variables for the CPU


L2 Cache

COAST: Cache on a Stick


Cache Memory Continued
  Level 1 Cache is stored on the CPU
  L2/L3 Cache is stored near, but not on, the CPU
  L1/L2/L3 Cache is more expensive than RAM


Different CPU Caches:


Cache Memory Replacement Policies
CPU requests data from cached memory
  Cache memory has data, sends to CPU (Hit)
  Only RAM has data, sends to CPU (Miss)
Cache performance is measured by the hit ratio, hits / (hits + misses); the ideal algorithm has a ratio close to 1.


Replacement Policies
On a miss, the data from RAM is loaded into the cache.
Due to the nature of most programs (variables are accessed multiple times in the same program), this cached data has a high chance of being accessed again soon.
One possible replacement algorithm is FIFO-based: the oldest cached data is replaced first.
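The FIFO policy described above can be sketched in a few lines. This is a minimal illustration, assuming a fully associative cache that holds whole blocks identified by number; the function name and the request sequence are hypothetical.

```python
from collections import deque


def fifo_hit_ratio(requests, capacity):
    """Replay block requests through a FIFO cache and return hits / accesses."""
    cached = set()      # blocks currently in the cache
    order = deque()     # arrival order; the left end is the oldest block
    hits = 0
    for block in requests:
        if block in cached:
            hits += 1               # hit: FIFO does not reorder on reuse
            continue
        if len(cached) == capacity:
            oldest = order.popleft()
            cached.remove(oldest)   # evict the block that arrived first
        cached.add(block)
        order.append(block)
    return hits / len(requests)


# Hypothetical trace: block 1 is reused often but is still evicted once it is oldest.
print(fifo_hit_ratio([1, 2, 3, 1, 4, 1, 2], capacity=3))
```

The trace shows FIFO's weakness: it ignores how recently a block was used, so a frequently accessed block can be replaced simply because it was loaded first.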


CPU Cache - Write Policies
CPU writes data in cache (Hit)
  Write-through – cache and RAM are both updated
  Write-back – cache is updated; RAM is updated when new data is going to replace that data in cache
    A dirty bit keeps track of whether a part of the cache has changed
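The difference between the two write-hit policies shows up in how often RAM is written. The sketch below is a simplified model with a single cache line and a dictionary standing in for RAM; the class, the address 0x10, and the five writes are assumptions made for illustration.

```python
class CacheLine:
    """One cached word plus the dirty bit used by write-back."""

    def __init__(self, address, value):
        self.address = address
        self.value = value
        self.dirty = False


def run_writes(policy):
    """Apply five write hits, then evict the line; return (RAM value, RAM writes)."""
    ram = {0x10: 0}
    line = CacheLine(0x10, ram[0x10])    # the data is already cached, so writes hit
    ram_writes = 0
    for value in range(1, 6):
        line.value = value               # both policies update the cache
        if policy == "write-through":
            ram[line.address] = value    # RAM is updated on every write
            ram_writes += 1
        else:                            # write-back
            line.dirty = True            # remember that RAM is now stale
    if line.dirty:                       # eviction: flush only if changed
        ram[line.address] = line.value
        ram_writes += 1
    return ram[0x10], ram_writes


print(run_writes("write-through"))
print(run_writes("write-back"))
```

Both policies leave the same final value in RAM; write-back simply defers the update until the line is replaced.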


CPU Cache - Write Policies
CPU writes but data is not in cache (Miss)
  Write-no-allocate – RAM updated
  Write-allocate – RAM updated, data loaded into cache
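The two write-miss policies differ only in whether the written data ends up in the cache, which decides whether a later read hits. A minimal sketch follows, assuming dictionaries for the cache and RAM and write-through behaviour on hits; the function names are hypothetical.

```python
def write(cache, ram, address, value, allocate):
    """Write a value and report whether the write hit or missed in the cache."""
    if address in cache:
        cache[address] = value   # write hit (write-through assumed for simplicity)
        ram[address] = value
        return "hit"
    ram[address] = value         # write miss: RAM is updated under both policies
    if allocate:
        cache[address] = value   # write-allocate: also load the data into cache
    return "miss"


def read(cache, ram, address):
    """Read a value, loading it into the cache on a miss."""
    if address in cache:
        return "hit", cache[address]
    cache[address] = ram[address]
    return "miss", cache[address]


for allocate in (False, True):
    cache, ram = {}, {}
    write(cache, ram, 7, 42, allocate)
    print("allocate" if allocate else "no-allocate", read(cache, ram, 7))
```

Write-allocate pays off when written data is read again soon; write-no-allocate avoids filling the cache with data that is written once and not reused.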


CPU Cache – SMP Problems
Multiprocessor systems have trouble with write-through and write-back
  E.g., one processor updates data cached by both processors
Possible Solutions:
  Cache only unshared memory
  Update all caches simultaneously
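The stale-data problem and the "update all caches" remedy can be shown with two simulated processors. This is a toy model, not a real coherence protocol: the Core class, the write_update helper, and the use of write-through to shared RAM are assumptions for illustration.

```python
class Core:
    """A processor with a private cache in front of shared RAM."""

    def __init__(self, ram):
        self.ram = ram
        self.cache = {}

    def read(self, address):
        if address not in self.cache:
            self.cache[address] = self.ram[address]   # miss: load from RAM
        return self.cache[address]

    def write(self, address, value):
        self.cache[address] = value                   # write-through to RAM,
        self.ram[address] = value                     # but other caches are untouched


def write_update(cores, writer, address, value):
    """Write, then push the new value into every other cache holding the address."""
    writer.write(address, value)
    for core in cores:
        if core is not writer and address in core.cache:
            core.cache[address] = value


ram = {0: 1}
a, b = Core(ram), Core(ram)
a.read(0)
b.read(0)                        # both processors now cache address 0
a.write(0, 2)
stale = b.read(0)                # b still sees its old cached copy
write_update([a, b], a, 0, 3)
fresh = b.read(0)                # b's copy was updated along with a's
print(stale, fresh)
```

Note that RAM is correct after the plain write; the problem is only the second processor's private copy.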


RAM Cache
Same as Cache Memory or CPU Cache, except instead of the L1/L2 cache caching RAM, RAM caches data from the hard disk.
For example, Windows loads multiple required libraries into memory in order to run, but code isn’t executed from them constantly.


Disk Cache
Like cache memory, disk cache holds data recently read from or written to a hard disk. This speeds up subsequent reads of recently accessed data.
Most newer hard drives have around 8 MB of cache memory.


Software Cache
Software also needs to cache data that it uses often in RAM; however, more often than not, this data isn’t directly from the hard disk.
The data software caches is often processed data from a file, translated by the program into another data structure. The process of translating may take a long time, so the cache in RAM is used to speed up the rest of the program’s run time.
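A software cache of this kind is often just a dictionary that stores the translated result so the slow translation runs once. The sketch below assumes a simple "key=value" text format and hypothetical function names; a real program would read the text from a file.

```python
_parsed_cache = {}    # name -> already-translated data structure
parse_count = 0       # counts how often the slow translation actually runs


def parse(raw_text):
    """Translate 'key=value' lines into a dict (stands in for a slow conversion)."""
    global parse_count
    parse_count += 1
    result = {}
    for line in raw_text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            result[key.strip()] = value.strip()
    return result


def load(name, raw_text):
    """Return the parsed form of raw_text, translating only on the first request."""
    if name not in _parsed_cache:
        _parsed_cache[name] = parse(raw_text)    # miss: do the work and keep it
    return _parsed_cache[name]                   # hit: reuse the stored structure


settings = load("settings", "width = 800\nheight = 600")
settings_again = load("settings", "width = 800\nheight = 600")
print(settings, parse_count)
```

The second call returns the same object without parsing again, which is where the run-time saving comes from.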


Software Cache - Example
Games such as Half-Life and Unreal Tournament 2004 have a loading time before each game where the map, models, sounds, and other important data are all loaded into RAM.
Example: A website that loads lots of data from an SQL server on each page load may take a few seconds per page, but if the first load caches all the data, later loads may take only a few milliseconds.


Software Cache - Example
Internet Explorer

History – Keeps a cache of recently viewed websites for easy access

Cache – Keeps images and other files on disk so websites load faster the second time you visit them. These files are updated when there is a newer version of them on the server, or the cache is cleared.
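The refresh rule for the browser cache (reuse the local copy unless the server has a newer version or the cache was cleared) can be sketched as follows. This is a toy model, not how Internet Explorer is implemented: the PageCache class, the integer version numbers, and the dictionary standing in for the server are all assumptions.

```python
class PageCache:
    """Keeps downloaded files and re-downloads only when the server copy is newer."""

    def __init__(self, server):
        self.server = server      # hypothetical server: url -> (version, content)
        self.entries = {}         # local copies: url -> (version, content)
        self.downloads = 0

    def get(self, url):
        server_version = self.server[url][0]      # cheap check of the current version
        cached = self.entries.get(url)
        if cached is not None and cached[0] == server_version:
            return cached[1]                      # local copy is still current
        self.downloads += 1                       # missing or outdated: fetch again
        self.entries[url] = self.server[url]
        return self.entries[url][1]

    def clear(self):
        self.entries.clear()                      # user cleared the cache


server = {"/logo.png": (1, "old image")}
browser = PageCache(server)
browser.get("/logo.png")
browser.get("/logo.png")                  # second visit is served from the cache
server["/logo.png"] = (2, "new image")
latest = browser.get("/logo.png")         # newer version on the server forces a download
print(latest, browser.downloads)
```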


Interesting Facts
The first Celeron CPU had no L2 cache; however, it performed almost as fast as the PII with cache.
Because some chipsets couldn’t cache more than the first 64 MB of RAM, a rumor spread that Windows 98 could only use up to 64 MB of RAM; in reality, it supports up to 2 GB of RAM.
In these chipsets, adding more than 64 MB of RAM decreased performance.