24
CSE410, Winter 2017 L18: Caches II Caches II CSE 410 Winter 2017 Instructor: Teaching Assistants: Justin Hsia Kathryn Chan, Kevin Bi, Ryan Wong, Waylon Huang, Xinyu Sui IBM’s hub for wearables could shorten your hospital stay Researchers at IBM have developed a hub for wearables that can gather information from multiple wearable devices. The gadget, [called ‘Chiyo’], funnels data from devices such as smart watches and fitness bands into the IBM Cloud. There, it's analyzed and the results are shared with the user and their doctor. IBM isn't planning to get into the wearables business. Instead, it plans to offer the service as a platform on which other companies can build their own health services. http://www.pcworld.com/article/3170153/wearables/ ibms‐hub‐for‐wearables‐could‐have‐you‐out‐of‐the‐hospital‐faster.html

Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Caches IICSE 410 Winter 2017

Instructor:  Teaching Assistants:Justin Hsia Kathryn Chan, Kevin Bi, Ryan Wong, Waylon Huang, Xinyu Sui

IBM’s hub for wearables could shorten your hospital stayResearchers at IBM have developed a hub for wearables that can gather information from multiple wearable devices.The gadget, [called ‘Chiyo’], funnels data from devices such as smart watches and fitness bands into the IBM Cloud. There, it's analyzed and the results are shared with the user and their doctor.IBM isn't planning to get into the wearables business. Instead, it plans to offer the service as a platform on which other companies can build their own health services.

• http://www.pcworld.com/article/3170153/wearables/ibms‐hub‐for‐wearables‐could‐have‐you‐out‐of‐the‐hospital‐faster.html

Page 2: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Administrivia

Lab 3 due Thursday (2/23) Homework 4 released today (Structs, Caches)

Mid‐Quarter Survey Feedback Pace is “moderate” to “a bit too fast” and course is more than 3 units of work You talk too fast in lecture (or rush at the end) and I wish there were more peer instruction questions Canvas quiz answer keys are annoying, but instant homework feedback is great Sections: “shorten discussions and lengthen explanations” 

2

Page 3: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Review:  Example Memory Hierarchy

3

registers

on‐chip L1cache (SRAM)

main memory(DRAM)

local secondary storage(local disks)

Larger,  slower, cheaper per byte

remote secondary storage(distributed file systems, web servers)

Local disks hold files retrieved from disks on remote network servers

Main memory holds disk blocks retrieved from local disks

off‐chip L2cache (SRAM)

L1 cache holds cache lines retrieved from L2 cache

CPU registers hold words retrieved from L1 cache

L2 cache holds cache lines retrieved from main memory

Smaller,faster,costlierper byte

Page 4: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Making memory accesses fast!

Cache basics Principle of locality Memory hierarchies Cache organization Direct‐mapped (sets; index + tag) Associativity (ways) Replacement policy Handling writes

Program optimizations that consider caches

4

Page 5: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Cache Organization (1)

Block Size ( ):  unit of transfer between  and Mem Given in bytes and always a power of 2 (e.g. 64 B) Blocks consist of adjacent bytes (differ in address by 1)

• Spatial locality!

5

Note: The textbook uses “B” for block size

Page 6: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Cache Organization (1)

Block Size ( ):  unit of transfer between  and Mem Given in bytes and always a power of 2 (e.g. 64 B) Blocks consist of adjacent bytes (differ in address by 1)

• Spatial locality!

Offset field  Low‐order  bits of address tell you which byte within a block• (address) mod 2 =  lowest bits of address

(address) modulo (# of bytes in a block)

6

Block Number Block Offset‐bit address:(refers to byte in memory)

bitsbits

Note: The textbook uses “b” for offset bits

Page 7: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Cache Organization (2)

Cache Size ( ):  amount of data the  can store Cache can only hold so much data (subset of next level) Given in bytes ( ) or number of blocks ( ) Example:   = 32 KiB = 512 blocks if using 64‐B blocks

Where should data go in the cache? We need a mapping from memory addresses to specific locations in the cache to make checking the cache for an address fast

What is a data structure that provides fast lookup? Hash table!

7

Page 8: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Review:  Hash Tables for Fast Lookup

8

0123456789

Insert:52734102119

Apply hash function to map data to “buckets”

Page 9: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Place Data in Cache by Hashing Address

Map to cache index from block address Use next  bits  (block address) mod (# blocks in cache)

9

Block Addr Block Data0000000100100011010001010110011110001001101010111100110111101111

Memory CacheIndex Block Data00011011

Here  = 4 Band  / = 4

Page 10: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Place Data in Cache by Hashing Address

Map to cache index from block address Lets adjacent blocks fit in cache simultaneously!• Consecutive blocks go in consecutive cache indices

10

Block Addr Block Data0000000100100011010001010110011110001001101010111100110111101111

Memory CacheIndex Block Data00011011

Here  = 4 Band  / = 4

Page 11: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Place Data in Cache by Hashing Address

Collision! This might confuse the cache later when we access the data Solution?

11

Block Addr Block Data0000000100100011010001010110011110001001101010111100110111101111

Memory CacheIndex Block Data00011011

Here  = 4 Band  / = 4

Page 12: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Tags Differentiate Blocks in Same Index

Tag = rest of address bits bits =  Check this during a cache lookup

12

Block Addr Block Data0000000100100011010001010110011110001001101010111100110111101111

Memory CacheIndex Tag Block Data00 000110 0111 01

Here  = 4 Band  / = 4

Page 13: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Checking for a Requested Address

CPU sends address request for chunk of data Address and requested data are not the same thing!

• Analogy:  your friend ≠ his or her phone number

TIO address breakdown:

Index field tells you where to look in cache Tag field lets you check that data is the block you want Offset field selects specified start byte within block

Note: and  sizes will change based on hash function13

Tag ( ) Offset ( )‐bit address:

Block Number

Index ( )

Page 14: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Cache Puzzle #1

Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known as a cold cache Access (addr: hit/miss) stream:

• (14: miss), (15: hit), (16: miss)

A. 4 bytesB. 8 bytesC. 16 bytesD. 32 bytesE. We’re lost…

14

Vote at http://PollEv.com/justinh

Page 15: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Direct‐Mapped Cache

Hash function:  (block address) mod (# of blocks in cache) Each memory address maps to exactly one index in the cache Fast (and simpler) to find an address

15

Block Addr Block Data00 0000 0100 1000 1101 0001 0101 1001 1110 0010 0110 1010 1111 0011 0111 1011 11

Memory CacheIndex Tag Block Data00 0001 1110 0111 01

Here  = 4 Band  / = 4

Page 16: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Direct‐Mapped Cache Problem

What happens if we access the following addresses? 8, 24, 8, 24, 8, …? Conflict in cache (misses!) Rest of cache goes unused

Solution?

16

Block Addr Block Data00 0000 0100 1000 1101 0001 0101 1001 1110 0010 0110 1010 1111 0011 0111 1011 11

Memory CacheIndex Tag Block Data00 ??01 ??1011 ??

Here  = 4 Band  / = 4

Page 17: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Associativity

What if we could store data in any place in the cache? More complicated hardware = more power consumed, slower

So we combine the two ideas: Each address maps to exactly one set Each set can store block in more than one way

17

01234567

0

1

2

3

Set

0

1

Set

1-way:8 sets,

1 block each

2-way:4 sets,

2 blocks each

4-way:2 sets,

4 blocks each

0

Set

8-way:1 set,

8 blocks

direct mapped fully associative

Page 18: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Cache Organization (3)

Associativity ( ):  # of ways for each set Such a cache is called an “ ‐way set associative cache” We now index into cache sets, of which there are  Use lowest  =  bits of block address

• Direct‐mapped: = 1, so  = log / as we saw previously• Fully associative: =  / , so  = 0 bits

18

Decreasing associativityFully associative(only one set)Direct mapped

(only one way)

Increasing associativity

Selects the setUsed for tag comparison Selects the byte from block

Tag ( ) Index ( ) Offset ( )

Note: The textbook uses “b” for offset bits

Page 19: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Example Placement

Where would data from address 0x1833 be placed? Binary:  0b 0001 1000 0011 0011

19

= ? 

block size: 16 Bcapacity: 8 blocksaddress: 16 bits

Set Tag Data01234567

Direct‐mappedSet Tag Data

0

1

2

3

Set Tag Data

0

1

2‐way set associative 4‐way set associative

Tag ( ) Offset ( )‐bit address: Index ( )

= log / / = log=  – –

= ?  = ? 

Page 20: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Example Placement

Where would data from address 0x1833 be placed? Binary:  0b 0001 1000 0011 0011

20

= 3

block size: 16 Bcapacity: 8 blocksaddress: 16 bits

Set Tag Data01234567

Direct‐mappedSet Tag Data

0

1

2

3

Set Tag Data

0

1

2‐way set associative 4‐way set associative= 2 = 1

Tag ( ) Offset ( )‐bit address: Index ( )

= log / / = log=  – –

Page 21: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Block Replacement

Any empty block in the correct set may be used to store block If there are no empty blocks, which one should we replace? No choice for direct‐mapped caches Caches typically use something close to least recently used (LRU)

(hardware usually implements “not most recently used”)

21

Set Tag Data01234567

Direct‐mappedSet Tag Data

0

1

2

3

Set Tag Data

0

1

2‐way set associative 4‐way set associative

Page 22: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Cache Puzzle #2

What can you infer from the following behavior? Cache starts empty, also known as a cold cache Access (addr: hit/miss) stream:

• (10: miss), (12: miss), (10: miss)

Associativity?

Number of sets?

22

Page 23: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

General Cache Organization ( ,  ,  )

23

= blocks/lines per set

= # sets= 2

set

“line” (block plusmanagement bits)

0 1 2 K‐1TagV

valid bit= bytes per block

Cache size:data bytes

(doesn’t include V or Tag)

Page 24: Caches II - University of Washington...Cache Puzzle#1 Based on the following behavior, which of the following block sizes is NOT possible for our cache? Cache starts empty, also known

CSE410, Winter 2017L18: Caches II

Notation Review

We just introduced a lot of new variable names! Please be mindful of block size notation when you look at past exam questions or are watching videos

24

Variable This Quarter Formulas

Block size inbook

2 ↔ log2 ↔ log2 ↔ log

log / /

Cache size

Associativity

Number of Sets

Address space

Address width

Tag field width

Index field width

Offset field width inbook