5
TUT 6 Q.1- A program consists of two nested loops   a small inner loop and a much larger outer loop. The general structure of the program is given in Figure P5.1. The decimal memory addresses shown delineate the location of the two loops and the beginning and end of the total program. All memory locations in the various sections, 17   22, 23   164, 165   239, and so on, contain instructions to be executed in straight-line sequencing. The program is to be run on a computer that has an instruction cache organized in the direct-mapped manner (see Figure 5.15) and that has the following  parameters Main memory size 64 K words Cache size 1 K words Block size 128 words The cycle time of the main memory is 10τ s, and the cycle time of the cache is 1τ s.  Figure P5.1 a) Specify the number of bits in the TAG, BLOCK and WORD fields in the main memory addresses.  b) Compute the total time needed for instruction fetching during execution of the  program in Figure P5.1 Solution a) A block contains 128 = 2 7 words => 7 bits Cache can contain 1K/128 = 8 = 2 3 blocks => 3 bits Main memory can contain 64K/128 = 512 blocks A block of cache can replace 512 / 8 = 64 = 2 6 blocks => 6 bits TAG BLOCK WORD 17 23 165 239 1200 1500 Start End Inner Loop 20 times Outer Loop 10 times

comp_arch

Embed Size (px)

Citation preview

Page 1: comp_arch

 

TUT 6Q.1- A program consists of two nested loops  – a small inner loop and a much larger 

outer loop. The general structure of the program is given in Figure P5.1. The decimal

memory addresses shown delineate the location of the two loops and the beginning

and end of the total program. All memory locations in the various sections, 17  – 22,

23  – 164, 165  – 239, and so on, contain instructions to be executed in straight-line

sequencing. The program is to be run on a computer that has an instruction cache

organized in the direct-mapped manner (see Figure 5.15) and that has the following

 parameters

Main memory size 64 K words

Cache size 1 K words

Block size 128 words

The cycle time of the main memory is 10τ s, and the cycle time of the cache is 1τ s.  

Figure P5.1

a) 

Specify the number of bits in the TAG, BLOCK and WORD fields in the main

memory addresses.

 b)  Compute the total time needed for instruction fetching during execution of the

 program in Figure P5.1

Solution

a)

A block contains 128 = 27 words => 7 bits

Cache can contain 1K/128 = 8 = 23 blocks => 3 bits

Main memory can contain 64K/128 = 512 blocks

A block of cache can replace 512 / 8 = 64 = 26 blocks => 6 bits

TAG BLOCK WORD

17

23

165

239

1200

1500

Start

End

Inner Loop

20 times

Outer Loop

10 times

Page 2: comp_arch

 

 

So the main memory address is6 3 7

Page 3: comp_arch

 

 b) Block Word

0  0 – 127 > 17, 23

128 – 255 > 165, 239

2  256 – 383

3  384 – 511

4  512 – 639

5  640 – 767

768 – 895

7  896 – 1,023

8  1,024 – 1,151

9  1,152 – 1,279 > 1,200

10  1,280 – 1,407

11 

1,408 – 1,535 > 1,500

EVENT WORD HIT/MISS ACTION

CACHE

Start: 17 cache miss add Block 0

count 1

18 – 22 cache hit

23 – 127 cache hit

128 cache miss add Block 1

count 2

129 – 164 cache hit

20 times 165 – 239 cache hit

240 – 255 cache hit

256 cache miss add Block 2

count 3

384 cache miss add Block 3

count 4

512 cache miss add Block 4

count 5

640 cache miss add Block 5

count 6

768 cache miss add Block 6

count 7

896 cache miss add Block 7

FULL

1,024 cache miss add Block 8 remove Block 0

FULL

1,152 cache miss add Block 9 remove Block 1

FULL

1,153 – 1,200 cache hit

9 times Outer Loop:

23 cache miss add Block 0 remove Block 8

FULL

24 – 127 cache hit

128 cache miss add Block 1 remove Block 9

FULL

129 – 164 cache hit

Page 4: comp_arch

 

20 times 165 – 239 cache hit

240 – 1,023 cache hit

1,024 cache miss add Block 8 remove Block 0

FULL

1,152 cache miss add Block 9 remove Block 1

FULL

1,153 – 1,200 cache hit

1,201 – 1,279 cache hit

1,280 cache miss add Block 10 remove Block 2

FULL

1,408 cache miss add Block 11 remove Block 3

FULL

So, Cache miss occurs 10 + (9 x 4) + 2 = 48 times

Use 48 x 10τ  = 480τ s, 

Cache hit occurs ((127  – 17) + (164  –  128) + 20(239  –  164) + (255  –  

239) + (383 – 256)

+ (511  –  384) + (639  –  512) + (767  – 640) + (895  –  

768) + (1,023 – 896)

+ (1,151 – 1,024) + (1,200 – 1,152)

+ 9((127 – 23) + (164 - 128) + 20(239  – 164) + (1,023 -

239)

+ (1,151 – 1,024) + (1,200 – 1,152))

+ (1,279 – 1,200) + (1,407 – 1,280) + (1,500 – 1,408)) x

1τ = 26,288τ s 

Total 480τ + 26,288τ = 26,768τ s. to execute the program.  

Page 5: comp_arch