ECE 254A

Advanced Computer Architecture: Supercomputers

Fall 2006

University of California, Santa Barbara

Department of Electrical and Computer Engineering

Project 1

“Designing a Simple Cache”

Ali Umut IRTURK 789139-3

ECE Department & ECON Department

Graduate Student

10/15/2006

1) The Basics of Memory Hierarchy and Overview of the Project

The memory system is organized as a hierarchy: a level closer to the processor is generally a subset of any level further away, and all the data is stored at the lowest level. Implementing memory as a hierarchy gives the user the illusion of a memory that is as large as the largest level of the hierarchy but can be accessed as if it were all built from the fastest memory.

Figure 1. The basic structure of a memory hierarchy.

The aim of the first project is to “design a simple cache.” Following this general hierarchy, my design must look like Figure 2. The goal is to present the user with as much memory as is available in the cheapest technology, while providing access at the speed offered by the fastest memory.

Figure 2. The basic design of the project. The input and output ports are not designated at this point.

As can be seen from Figure 2, I need to decide which input and output ports are required.

2) Discovering the Input and Output ports

I will consider each component in turn and identify its input and output ports. At this step I do not yet specify the widths of the inputs and outputs.

a) Cpu:

The Cpu accesses the cache whenever it needs to read information or to write information. Thus:

When information is needed (a read):

i) The Cpu must signal this with “a read signal.”

ii) The Cpu must indicate where the data is with “address bits.”

When writing is considered:

iii) The Cpu must signal this with “a write signal.”

iv) The Cpu must indicate which data is to be written with “data bits.”

v) The Cpu must indicate where the data will be written with “address bits” (the same address bits that are used for a read).

This shows that there must be 4 outputs from the Cpu to the Cache (Cache inputs from the Cpu). I designated them using cpu_cac_NAME. The read signal and the write signal each need 1 bit. The widths of the address and data bits will be decided later.

b) Memory:

If a miss occurs in the Cache after a Cpu request, the Cache must access the Memory to retrieve the data. Thus the Memory needs one output to the Cache:

i) The requested data is sent by “data bits” from the Memory to the Cache. I designated this using mem_cac_data. The width of the data bits is considered later.

c) Cache:

The cache is the most important part of this design. There must be several outputs from the Cache to the Cpu and to the Memory.

The relationship between Cache and Cpu

If the Cpu gives the read signal and sends the address of the data:

i) If the data requested by the processor appears in the Cache, this is called a hit. This must first be reported to the Cpu by sending “a hit bit.” The data found must then be sent back to the Cpu, so the Cache needs an output to the Cpu for “data bits.”

ii) If the data is not found in the Cache, the request is called a miss. The Memory is then accessed to retrieve the block containing the requested data. This must be reported to the Cpu by sending “a miss bit.”

This shows that there must be 3 outputs from the Cache to the Cpu (Cpu inputs from the Cache). These are designated by cac_cpu_NAME. The hit signal and the miss signal each need 1 bit. The width of the data bits is considered later.

The relationship between Cache and Memory

As I mentioned before, if a miss occurs, the Memory must be accessed to retrieve the desired block. Likewise, if the Cpu wants to write information to the Memory through the Cache, there must be several outputs from the Cache to the Memory.

If a miss occurs in the Cache:

i) The Cache must signal this to the Memory by sending “a read bit.”

ii) The Cache must indicate where the data is with “address bits.”

If writing is considered:

iii) The Cache must signal this with “a write bit.”

iv) The Cache must indicate which data is to be written with “data bits.”

This shows that there must be 4 outputs from the Cache to the Memory (Memory inputs from the Cache). They are designated as cac_mem_NAME. The read bit and the write bit each need 1 bit. The widths of the address and data bits still have to be decided. At this point, I know the general inputs and outputs, which can be seen in Figure 3.

Figure 3. The inputs and outputs of the design.

At this point I have not yet considered how wide the address and data bits must be, the signals that come from outside the design (such as the clock and reset), or the cache structure.

3) Cache Architecture

In this simple cache design, I designed an 8 by 8 cache: there are 8 entries, and each entry consists of 8 bits.

There are four important questions to answer at this point:

a. How do we know if a data item is in the cache?

b. If it is, how do we find it?

The answers to these two questions are related. If each data item can go in exactly one place in the cache, then it is straightforward to find it if it is in the cache. The simplest way to assign a cache location to each data item in memory is to base the location on the item's address in memory. This cache structure is called direct mapped, since each memory location is mapped directly to exactly one location in the cache. I used a direct-mapped cache structure in my design. The typical mapping between addresses and cache locations for a direct-mapped cache is:

(Block address) modulo (Number of cache blocks in the cache)

c. Because each cache location can contain the contents of a number of different memory locations, how do we know whether the data in the cache corresponds to the requested information? That is, how do we know whether the requested information is in the cache or not?

The answer to this question is to add a set of tags to the cache. The tags contain the address information required to identify whether the information in the cache corresponds to the requested information.

d. We also need a way to recognize that a cache block does not have valid information.

The most common method is to add a valid bit to indicate whether an entry contains a valid address. Basically, if the bit is not set, there cannot be a match for this block.
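The four answers above can be put together in a small behavioral model. The following Python sketch is not part of the project, which is written in Verilog. It assumes the textbook modulo mapping quoted above, and the names `NUM_BLOCKS`, `Line`, `cache_index`, `cache_tag`, and `lookup` are hypothetical.

```python
from dataclasses import dataclass

NUM_BLOCKS = 8  # number of entries in the cache, as in this design


@dataclass
class Line:
    valid: bool = False  # answer (d): does this entry hold real data?
    tag: int = 0         # answer (c): which memory block is stored here?
    data: int = 0


def cache_index(block_address: int) -> int:
    """Answers (a) and (b): each block address maps to exactly one entry."""
    return block_address % NUM_BLOCKS


def cache_tag(block_address: int) -> int:
    """The remaining upper bits distinguish blocks that share an entry."""
    return block_address // NUM_BLOCKS


def lookup(cache, block_address):
    """Return (hit, data); data is None on a miss."""
    line = cache[cache_index(block_address)]
    if line.valid and line.tag == cache_tag(block_address):
        return True, line.data
    return False, None


cache = [Line() for _ in range(NUM_BLOCKS)]
print(lookup(cache, 26))  # every entry is still invalid, so this is a miss
cache[cache_index(26)] = Line(valid=True, tag=cache_tag(26), data=7)
print(lookup(cache, 26))  # the same block address is now a hit
print(lookup(cache, 18))  # 18 maps to the same entry but carries another tag
```

Block addresses 26 and 18 both map to entry 2, so only the tag comparison tells them apart, and the valid bit keeps an empty entry from ever matching.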

Thus, consider the address output from the Cpu. Since there are 8 blocks (entries) in the cache, 3 bits of the address are needed to give the block number (the index). The remaining bits form a tag field, which is compared with the value of the tag field stored in the cache. This is illustrated by Figure 4.

Figure 4. The address sent by the Cpu is matched against the cache. The Cpu Tag and the Cpu Index together construct the address sent by the Cpu.

The index of a cache block, together with the tag contents of that block, uniquely specifies the memory address of the information contained in the cache block. With a 3-bit index and a 2-bit tag, the address is 5 bits wide. Each 8-bit cache entry holds 1 valid bit, the 2-bit tag, and 5 bits of data, so the data is also 5 bits wide. Besides these, I need to add the clock and reset signals to the design.
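The bit budget can be checked with a short Python sketch. It follows the field positions used by the Verilog listings in section 5 (`cpu_cac_add[0:2]` as index, `cpu_cac_add[3:4]` as tag, `cpu_buf[7]` as valid bit, `cpu_buf[6:5]` as tag, `cpu_buf[4:0]` as data); the function names `split_address`, `unpack_entry`, and `pack_entry` are hypothetical.

```python
TAG_BITS = 2
INDEX_BITS = 3
DATA_BITS = 5


def split_address(address: int):
    """Split a 5-bit address into (tag, index).

    As in the listings, the three leftmost bits select the cache entry
    and the two rightmost bits are the tag.
    """
    index = (address >> TAG_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address & ((1 << TAG_BITS) - 1)
    return tag, index


def unpack_entry(entry: int):
    """Split an 8-bit cache entry into (valid, tag, data)."""
    valid = (entry >> (TAG_BITS + DATA_BITS)) & 1
    tag = (entry >> DATA_BITS) & ((1 << TAG_BITS) - 1)
    data = entry & ((1 << DATA_BITS) - 1)
    return valid, tag, data


def pack_entry(valid: int, tag: int, data: int) -> int:
    """Build an 8-bit cache entry from its three fields."""
    return (valid << (TAG_BITS + DATA_BITS)) | (tag << DATA_BITS) | data


tag, index = split_address(0b10011)
print(format(tag, "02b"), format(index, "03b"))
print(unpack_entry(0b11100100))
```

Taking the index from the leftmost address bits is not the same as the modulo mapping, which would use the rightmost three bits. The sketch follows the listings rather than the formula.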

As a result, Figure 5 shows the resulting design.

Figure 5. The resulting design of the project. The signals are specified in the Interfaces part below.

Interfaces

Connection   Variable Name in Verilog Codes   Bits

1            mem_cac_data                     5
2            cac_mem_read                     1
3            cac_mem_wrt                      1
4            cac_mem_data                     5
5            cac_mem_add                      5
6            cpu_cac_read                     1
7            cpu_cac_wrt                      1
8            cpu_cac_data                     5
9            cpu_cac_add                      5
10           cac_cpu_hit                      1
11           cac_cpu_miss                     1
12           cac_cpu_data                     5

Table 1. The names are given according to the usage of the signals.

Data Flow Diagrams

While discovering the input and output ports and working on the cache architecture, I described what happens when a read or a write occurs. Data flow diagrams, however, make this easier to understand.

Figure 6. Data Flow Diagram for Reading

Figure 7. Data Flow Diagram for Writing

Figure 8. State Flow diagram

4) Testbenches and Results

Hit Test

1) The Cpu requests data (returned on cac_cpu_data) from the cache by giving an address (cpu_cac_add) and setting the read bit (cpu_cac_read).

2) As we know, the address (cpu_cac_add) given by the Cpu is 5 bits and consists of the cpu tag (cpu_tag) and the cpu index (cpu_index).

3) In this test, the Cpu sends the address b10011 (cpu_cac_add) and sets the read bit (cpu_cac_read <= 1'b1).

4) As can be seen from the snapshot, cpu tag (cpu_tag) = 11 and cpu index (cpu_index) = 100, which agrees with the address: the index is the three leftmost bits and the tag is the two rightmost bits.

5) In the cache, the entry at index = 100 is 11100100.

6) The cache compares the current tag (cur_tag) at that index, which is 11, with the cpu tag (cpu_tag). They match. It also checks the valid bit, which is 1.

7) Thus the cache finds the requested data at the given cpu index (cpu_index) and sends a hit signal to the Cpu (cac_cpu_hit <= 1'b1).

8) Finally, the cache sends the requested data, b00100, which is the data field of the entry.

Figure 9. Snapshot of the Hit process

Figure 10. Snapshot of the Hit process
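The hit test can be replayed outside the simulator. The Python sketch below is a behavioral approximation, not the Verilog itself: the cache contents are copied from the reset branch of the listings in section 5, and the function name `read` is hypothetical.

```python
# Initial cache contents, copied from the reset branch of the Verilog listings.
# Each 8-bit entry is {valid, tag[1:0], data[4:0]}.
INITIAL_CACHE = [
    0b11101000,  # cache[0]
    0b11000111,  # cache[1]
    0b10100110,  # cache[2]
    0b10000101,  # cache[3]
    0b11100100,  # cache[4]
    0b11000011,  # cache[5]
    0b10100010,  # cache[6]
    0b10000001,  # cache[7]
]


def read(cache, address):
    """Return (hit, data) for a 5-bit address; data is None on a miss."""
    cpu_index = (address >> 2) & 0b111  # three leftmost address bits
    cpu_tag = address & 0b11            # two rightmost address bits
    cpu_buf = cache[cpu_index]
    valid = (cpu_buf >> 7) & 1
    cur_tag = (cpu_buf >> 5) & 0b11
    hit = valid == 1 and cur_tag == cpu_tag
    data = (cpu_buf & 0b11111) if hit else None
    return hit, data


hit, data = read(INITIAL_CACHE, 0b10011)
print(hit, format(data, "05b"))  # prints: True 00100
```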

Miss Test

1) The Cpu requests data (returned on cac_cpu_data) from the cache by giving an address (cpu_cac_add) and setting the read bit (cpu_cac_read).

2) As we know, the address (cpu_cac_add) given by the Cpu is 5 bits and consists of the cpu tag (cpu_tag) and the cpu index (cpu_index).

3) In this test, the Cpu sends the address b11010 (cpu_cac_add) and sets the read bit (cpu_cac_read <= 1'b1).

4) With the same address split as in the hit test, cpu tag (cpu_tag) = 10 and cpu index (cpu_index) = 110.

5) In the cache, the entry at index = 110 is 10100010.

6) The cache compares the current tag (cur_tag) at that index, which is 01, with the cpu tag (cpu_tag). They do not match.

7) Thus the cache cannot find the requested data at the given cpu index (cpu_index) and sends a miss signal to the Cpu (cac_cpu_miss <= 1'b1).

8) The cache needs to fetch the requested data from memory.

Figure 11. Snapshot of the Miss process
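To see why b11010 misses while other addresses hit, one can enumerate all 32 addresses against the initial cache contents. This Python sketch is only an illustration under the same assumptions as the listings (index in the three leftmost bits, tag in the two rightmost); the names `INITIAL_CACHE` and `is_hit` are hypothetical.

```python
# Initial cache contents from the reset branch of the Verilog listings,
# indexed 0..7. Each entry is {valid, tag[1:0], data[4:0]}.
INITIAL_CACHE = [
    0b11101000, 0b11000111, 0b10100110, 0b10000101,
    0b11100100, 0b11000011, 0b10100010, 0b10000001,
]


def is_hit(cache, address):
    """True when the indexed entry is valid and its tag equals the address tag."""
    entry = cache[(address >> 2) & 0b111]
    valid = (entry >> 7) & 1
    cur_tag = (entry >> 5) & 0b11
    return valid == 1 and cur_tag == (address & 0b11)


hits = [a for a in range(32) if is_hit(INITIAL_CACHE, a)]
print([format(a, "05b") for a in hits])
print(is_hit(INITIAL_CACHE, 0b11010))  # the miss-test address
```

Exactly one address per entry hits, because each entry stores one tag. For index 110 the stored tag is 01, so b11001 hits and b11010 does not.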

Write Test

1) The Cpu requests that data (cpu_cac_data) be written through the cache by giving an address (cpu_cac_add) and the data (cpu_cac_data), and by setting the write bit (cpu_cac_wrt).

2) As we know, the address (cpu_cac_add) given by the Cpu is 5 bits and consists of the cpu tag (cpu_tag) and the cpu index (cpu_index).

3) In this test, the Cpu sends the address b11010 (cpu_cac_add) and sets the write bit (cpu_cac_wrt <= 1'b1).

4) The cache receives the data and the address and sends them to the memory using cac_mem_data and cac_mem_add (cac_mem_data <= 11111, cac_mem_add <= 11010). The cache also sets the write bit for the Memory (cac_mem_wrt <= 1'b1).

5) The Memory stores the new data (cac_mem_data) at the given address (cac_mem_add).

Figure 12. Snapshot of the Write process
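The write path can be sketched in a few lines of Python. This is a simplified model of listing C, in which the cache forwards a Cpu write to the memory and does not modify its own entries. The function names `cache_write` and `memory_step` are hypothetical; the signal names are the ones from Table 1.

```python
def cache_write(cpu_cac_add, cpu_cac_data):
    """Model the cache's memory-side outputs for a Cpu write.

    The cache asserts cac_mem_wrt and forwards address and data unchanged.
    """
    return {
        "cac_mem_wrt": 1,
        "cac_mem_add": cpu_cac_add,
        "cac_mem_data": cpu_cac_data,
    }


def memory_step(mem_data, signals):
    """Model the memory: store the data when the write bit is set."""
    if signals["cac_mem_wrt"] == 1:
        mem_data[signals["cac_mem_add"]] = signals["cac_mem_data"]


# After reset, the memory listing initializes mem_data[i] to i.
mem_data = list(range(32))
signals = cache_write(0b11010, 0b11111)
memory_step(mem_data, signals)
print(format(mem_data[0b11010], "05b"))  # prints: 11111
```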

Miss Occurs, and the data is retrieved from memory

Basically, this part combines the first three tests and adds one more step: retrieving the data from memory. In the first three tests, hit, miss, and write can each be handled in one state. Retrieving the data from memory, however, requires 3 states. These states are illustrated in Figure ?.

1) The Cpu requests data (returned on cac_cpu_data) from the cache by giving an address (cpu_cac_add) and setting the read bit (cpu_cac_read).

2) As we know, the address (cpu_cac_add) given by the Cpu is 5 bits and consists of the cpu tag (cpu_tag) and the cpu index (cpu_index).

3) In this test, the Cpu sends the address b11010 (cpu_cac_add) and sets the read bit (cpu_cac_read <= 1'b1).

4) With the same address split as in the hit test, cpu tag (cpu_tag) = 10 and cpu index (cpu_index) = 110.

5) In the cache, the entry at index = 110 is 10100010.

6) The cache compares the current tag (cur_tag) at that index, which is 01, with the cpu tag (cpu_tag). They do not match.

7) Thus the cache cannot find the requested data at the given cpu index (cpu_index) and sends a miss signal to the Cpu (cac_cpu_miss <= 1'b1).

8) The cache needs to fetch the requested data from memory. The cache sets the read bit for the Memory (cac_mem_read <= 1'b1) and sends the address to the Memory on cac_mem_add.

9) The Memory finds the data using the address (cac_mem_add): cac_mem_add = 11010, which is 26 in decimal, and mem_data[26] = b11010. So the Memory sends mem_data[26] on mem_cac_data.

10) After the cache receives the requested data, it sends the data to the Cpu using cac_cpu_data.

Figure 13. Snapshot of retrieving the data from memory after a Miss and sending it to the Cpu.
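The three states can be traced with a clock-by-clock Python model. This is a rough sketch of the state machine in listing D, not a cycle-accurate replacement for the simulation: the class name `CacheModel` and its attributes are hypothetical, and the memory is assumed to answer a read request before state S2 is reached.

```python
INITIAL_CACHE = [
    0b11101000, 0b11000111, 0b10100110, 0b10000101,
    0b11100100, 0b11000011, 0b10100010, 0b10000001,
]


class CacheModel:
    S0, S1, S2 = 0, 1, 2  # check for hit, wait for memory, refill and reply

    def __init__(self, cache, mem_data):
        self.cache = list(cache)
        self.mem_data = mem_data
        self.state = self.S0
        self.hit = 0
        self.miss = 0
        self.cpu_data = 0
        self.mem_cac_data = 0

    def clock(self, read, address):
        """Advance the model by one rising clock edge."""
        index = (address >> 2) & 0b111
        tag = address & 0b11
        if self.state == self.S0:
            if read:
                entry = self.cache[index]
                valid = (entry >> 7) & 1
                cur_tag = (entry >> 5) & 0b11
                if valid == 1 and cur_tag == tag:
                    self.hit, self.miss = 1, 0
                    self.cpu_data = entry & 0b11111
                else:
                    self.hit, self.miss = 0, 1
                    # Assumption: the memory answers the read request here.
                    self.mem_cac_data = self.mem_data[address]
                    self.state = self.S1
        elif self.state == self.S1:
            self.state = self.S2
        else:
            # S2: forward the data to the Cpu and refill the cache entry.
            self.cpu_data = self.mem_cac_data
            self.miss = 0
            self.cache[index] = (1 << 7) | (tag << 5) | self.mem_cac_data
            self.state = self.S0


model = CacheModel(INITIAL_CACHE, list(range(32)))
for _ in range(3):
    model.clock(read=1, address=0b11010)
print(format(model.cpu_data, "05b"), format(model.cache[6], "08b"))
```

After the three clocks, entry 6 holds the valid bit, the tag 10, and the data 11010, so a second read of the same address would be a hit in state S0.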

5) Codes

A) For Hit test

//Cache//

// Timescale
`timescale 1ns/100ps

// Module
module cache(
    // Inputs from outside
    clock,          // clock of the system
    rst_1,          // asynchronous active-low reset (rst_1 = 0 is a hard reset)
    // Inputs from Cpu
    cpu_cac_read,   // 1-bit read signal
    cpu_cac_wrt,    // 1-bit write signal
    cpu_cac_data,   // 5-bit data
    cpu_cac_add,    // 5-bit address
    // Outputs from Cache
    cac_cpu_hit,    // 1-bit hit flag
    cac_cpu_miss,   // 1-bit miss flag
    cac_cpu_data    // 5-bit data
);

    // Inputs to Cache
    input        clock;
    input        rst_1;
    input        cpu_cac_read;
    input        cpu_cac_wrt;
    input  [0:4] cpu_cac_data;
    input  [0:4] cpu_cac_add;

    // Outputs from Cache
    output       cac_cpu_hit;
    output       cac_cpu_miss;
    output [0:4] cac_cpu_data;

    // Registers
    reg          cac_cpu_hit;
    reg          cac_cpu_miss;
    reg    [0:4] cac_cpu_data;
    reg    [7:0] cache [0:7];   // 8 entries: {valid, tag[1:0], data[4:0]}

    // Wires
    wire   [7:0] cpu_buf;
    wire   [2:0] cpu_index;
    wire   [1:0] cpu_tag;
    wire   [1:0] cur_tag;

    // Interface
    assign cpu_index = cpu_cac_add[0:2];   // three leftmost address bits
    assign cpu_buf   = cache[cpu_index];   // entry selected by the index
    assign cur_tag   = cpu_buf[6:5];       // tag stored in that entry
    assign cpu_tag   = cpu_cac_add[3:4];   // two rightmost address bits

    // Work
    always @(posedge clock or negedge rst_1) begin
        if (rst_1 == 0) begin
            // Initial output values
            cac_cpu_hit  <= 1'b0;
            cac_cpu_miss <= 1'b0;
            cac_cpu_data <= 5'b0;
            // Initial cache contents
            cache[7] <= 8'b10000001;
            cache[6] <= 8'b10100010;
            cache[5] <= 8'b11000011;
            cache[4] <= 8'b11100100;
            cache[3] <= 8'b10000101;
            cache[2] <= 8'b10100110;
            cache[1] <= 8'b11000111;
            cache[0] <= 8'b11101000;
        end
        else begin
            if (cpu_cac_read == 1) begin
                if ((cpu_tag == cur_tag) & (cpu_buf[7] == 1'b1)) begin
                    // Read hit: valid entry with a matching tag
                    $display("READ HIT");
                    cac_cpu_hit  <= 1'b1;
                    cac_cpu_miss <= 1'b0;
                    cac_cpu_data <= cpu_buf[4:0];
                end
            end
        end
    end
endmodule

//Test bench for Hit//

`timescale 1ns/100ps

module test_write();
    // Registers
    reg        clock, rst_1, cpu_cac_read, cpu_cac_wrt;
    reg  [4:0] cpu_cac_data, cpu_cac_add;
    // Wires
    wire       cac_cpu_hit, cac_cpu_miss;
    wire [4:0] cac_cpu_data;

    // Instantiate Cache
    cache Cac(
        .clock(clock),
        .rst_1(rst_1),
        .cpu_cac_add(cpu_cac_add),
        .cpu_cac_read(cpu_cac_read),
        .cpu_cac_wrt(cpu_cac_wrt),
        .cpu_cac_data(cpu_cac_data),
        .cac_cpu_hit(cac_cpu_hit),
        .cac_cpu_miss(cac_cpu_miss),
        .cac_cpu_data(cac_cpu_data)
    );

    // Clock
    always #2 clock <= ~clock;

    // Start
    initial begin
        clock        <= 1'b0;
        rst_1        <= 1'b1;
        cpu_cac_read <= 1'b0;
        cpu_cac_wrt  <= 1'b0;
        cpu_cac_add  <= 5'b0;
        cpu_cac_data <= 5'b0;
        #5  rst_1 <= 1'b0;          // assert reset
        #10 rst_1 <= 1'b1;          // release reset
        #20 cpu_cac_read <= 1'b1;   // read address 10011 (expected hit)
            cpu_cac_add  <= 5'b10011;
        #10 cpu_cac_read <= 1'b0;
        $stop;
    end
endmodule

B) For Miss test

//Cache//

// Timescale
`timescale 1ns/100ps

// Module
module cache(
    // Inputs from outside
    clock,          // clock of the system
    rst_1,          // asynchronous active-low reset (rst_1 = 0 is a hard reset)
    // Inputs from Cpu
    cpu_cac_read,   // 1-bit read signal
    cpu_cac_wrt,    // 1-bit write signal
    cpu_cac_data,   // 5-bit data
    cpu_cac_add,    // 5-bit address
    // Outputs from Cache
    cac_cpu_hit,    // 1-bit hit flag
    cac_cpu_miss,   // 1-bit miss flag
    cac_cpu_data    // 5-bit data
);

    // Inputs to Cache
    input        clock;
    input        rst_1;
    input        cpu_cac_read;
    input        cpu_cac_wrt;
    input  [0:4] cpu_cac_data;
    input  [0:4] cpu_cac_add;

    // Outputs from Cache
    output       cac_cpu_hit;
    output       cac_cpu_miss;
    output [0:4] cac_cpu_data;

    // Registers
    reg          cac_cpu_hit;
    reg          cac_cpu_miss;
    reg    [0:4] cac_cpu_data;
    reg    [7:0] cache [0:7];   // 8 entries: {valid, tag[1:0], data[4:0]}

    // Wires
    wire   [7:0] cpu_buf;
    wire   [2:0] cpu_index;
    wire   [1:0] cpu_tag;
    wire   [1:0] cur_tag;

    // Interface
    assign cpu_index = cpu_cac_add[0:2];   // three leftmost address bits
    assign cpu_buf   = cache[cpu_index];   // entry selected by the index
    assign cur_tag   = cpu_buf[6:5];       // tag stored in that entry
    assign cpu_tag   = cpu_cac_add[3:4];   // two rightmost address bits

    // Work
    always @(posedge clock or negedge rst_1) begin
        if (rst_1 == 0) begin
            // Initial output values
            cac_cpu_hit  <= 1'b0;
            cac_cpu_miss <= 1'b0;
            cac_cpu_data <= 5'b0;
            // Initial cache contents
            cache[7] <= 8'b10000001;
            cache[6] <= 8'b10100010;
            cache[5] <= 8'b11000011;
            cache[4] <= 8'b11100100;
            cache[3] <= 8'b10000101;
            cache[2] <= 8'b10100110;
            cache[1] <= 8'b11000111;
            cache[0] <= 8'b11101000;
        end
        else begin
            if (cpu_cac_read == 1) begin
                if ((cpu_tag == cur_tag) & (cpu_buf[7] == 1'b1)) begin
                    // Read hit: valid entry with a matching tag
                    $display("READ HIT");
                    cac_cpu_hit  <= 1'b1;
                    cac_cpu_miss <= 1'b0;
                    cac_cpu_data <= cpu_buf[4:0];
                end
                else begin
                    // Read miss: invalid entry or tag mismatch
                    $display("READ MISS");
                    cac_cpu_hit  <= 1'b0;
                    cac_cpu_miss <= 1'b1;
                end
            end
        end
    end
endmodule

//Test bench for Miss//

`timescale 1ns/100ps

module test_write();
    reg        clock, rst_1, cpu_cac_read, cpu_cac_wrt;
    reg  [4:0] cpu_cac_data, cpu_cac_add;
    wire       cac_cpu_hit, cac_cpu_miss;
    wire [4:0] cac_cpu_data;

    // Instantiate Cache
    cache Cac(
        .clock(clock),
        .rst_1(rst_1),
        .cpu_cac_add(cpu_cac_add),
        .cpu_cac_read(cpu_cac_read),
        .cpu_cac_wrt(cpu_cac_wrt),
        .cpu_cac_data(cpu_cac_data),
        .cac_cpu_hit(cac_cpu_hit),
        .cac_cpu_miss(cac_cpu_miss),
        .cac_cpu_data(cac_cpu_data)
    );

    // Clock
    always #2 clock <= ~clock;

    // Start
    initial begin
        clock        <= 1'b0;
        rst_1        <= 1'b1;
        cpu_cac_read <= 1'b0;
        cpu_cac_wrt  <= 1'b0;
        cpu_cac_add  <= 5'b0;
        cpu_cac_data <= 5'b0;
        #5  rst_1 <= 1'b0;          // assert reset
        #10 rst_1 <= 1'b1;          // release reset
        #20 cpu_cac_read <= 1'b1;   // read address 11010 (expected miss)
            cpu_cac_add  <= 5'b11010;
        // The original had a stray "$stop" without a semicolon here and
        // re-asserted the read; the read is deasserted as in the hit test.
        #10 cpu_cac_read <= 1'b0;
        $stop;
    end
endmodule

C) For Write test

// Includes
`timescale 1ns/100ps

// Module Begin
module Cache(
    // Global Inputs
    clock,          // System clock
    rst_l,          // Asynchronous active-low reset
    // Inputs from CPU
    cpu_cac_add,    // 5-bit address from CPU
    cpu_cac_read,   // CPU Read to Cache
    cpu_cac_wrt,    // CPU Write to Cache
    cpu_cac_data,   // 5-bit data from CPU
    // Outputs to CPU
    cac_cpu_hit,    // Cache hit to CPU
    cac_cpu_miss,   // Cache miss (stall) to CPU
    cac_cpu_data,   // 5-bit data to CPU
    // Outputs to Main Memory
    cac_mem_add,    // 5-bit address to Main Memory
    cac_mem_data,   // 5-bit data to Main Memory
    cac_mem_read,   // Read signal to Main Memory
    cac_mem_wrt     // Write signal to Main Memory
);

    // Global Inputs
    input        clock;
    input        rst_l;
    // CPU Inputs
    input  [0:4] cpu_cac_data;
    input  [0:4] cpu_cac_add;
    input        cpu_cac_read;
    input        cpu_cac_wrt;

    // CPU Outputs
    output [4:0] cac_cpu_data;
    output       cac_cpu_hit;
    output       cac_cpu_miss;
    // Main Memory Outputs
    output [4:0] cac_mem_data;
    output [4:0] cac_mem_add;
    output       cac_mem_read;
    output       cac_mem_wrt;

    // CPU registered outputs
    reg    [4:0] cac_cpu_data;
    reg          cac_cpu_hit;
    reg          cac_cpu_miss;
    // Main Memory registered outputs
    reg    [4:0] cac_mem_data;
    reg    [4:0] cac_mem_add;
    reg          cac_mem_read;
    reg          cac_mem_wrt;
    // Cache Buffer: 8 entries of {valid, tag[1:0], data[4:0]}
    reg    [7:0] cache [0:7];

    // Wires
    wire   [7:0] cpu_buf;
    wire   [2:0] cpu_index;
    wire   [1:0] cpu_tag;
    wire   [1:0] cur_tag;
    wire   [4:0] mem_addr;

    // Interface
    assign cpu_index = cpu_cac_add[0:2];   // three leftmost address bits
    assign cpu_buf   = cache[cpu_index];   // entry selected by the index
    assign cur_tag   = cpu_buf[6:5];       // tag stored in that entry
    assign cpu_tag   = cpu_cac_add[3:4];   // two rightmost address bits
    assign mem_addr  = cpu_cac_add;        // full address forwarded to memory

    // Work
    always @(posedge clock or negedge rst_l) begin
        if (rst_l == 0) begin
            // Initial output values
            cac_cpu_hit  <= 1'b0;
            cac_cpu_miss <= 1'b0;
            cac_cpu_data <= 5'b0;
            cac_mem_data <= 5'b0;
            cac_mem_add  <= 5'b0;
            cac_mem_read <= 1'b0;
            cac_mem_wrt  <= 1'b0;
            // Initial cache contents
            cache[7] <= 8'b10000001;
            cache[6] <= 8'b10100010;
            cache[5] <= 8'b11000011;
            cache[4] <= 8'b11100100;
            cache[3] <= 8'b10000101;
            cache[2] <= 8'b10100110;
            cache[1] <= 8'b11000111;
            cache[0] <= 8'b11101000;
        end
        else begin
            if (cpu_cac_read == 1) begin
                if ((cpu_tag == cur_tag) & (cpu_buf[7] == 1'b1)) begin
                    // Read hit: valid entry with a matching tag
                    $display("READ HIT");
                    cac_cpu_hit  <= 1'b1;
                    cac_cpu_miss <= 1'b0;
                    cac_cpu_data <= cpu_buf[4:0];
                end
                else begin
                    // Read miss: invalid entry or tag mismatch
                    $display("READ MISS");
                    cac_cpu_hit  <= 1'b0;
                    cac_cpu_miss <= 1'b1;
                end
            end
            if (cpu_cac_wrt == 1) begin
                // Write: forward the address and data to memory
                cac_mem_wrt  <= 1'b1;
                cac_mem_add  <= mem_addr;
                cac_mem_data <= cpu_cac_data;
            end
        end
    end
endmodule

//Memory//

// Timescale
`timescale 1ns/100ps

// Memory Module
module memory(rst_l, cac_mem_add, cac_mem_data, cac_mem_wrt, cac_mem_read);
    // Inputs
    input       rst_l;
    input       cac_mem_wrt;
    input       cac_mem_read;
    input [4:0] cac_mem_data;
    input [4:0] cac_mem_add;

    // Registers
    reg [4:0] mem_data [31:0];   // 32 words of 5 bits (originally declared 11 bits wide)
    integer   i;

    // The original block was sensitive only to negedge rst_l, so its write
    // branch could never execute. A level-sensitive list is one possible fix.
    always @(rst_l or cac_mem_wrt or cac_mem_add or cac_mem_data) begin
        if (rst_l == 0) begin
            // Reset: mem_data[i] = i, e.g. mem_data[26] = 5'b11010
            for (i = 0; i < 32; i = i + 1)
                mem_data[i] = i[4:0];
        end
        else if (cac_mem_wrt == 1'b1) begin
            // Memory write
            mem_data[cac_mem_add] = cac_mem_data;
        end
    end
endmodule

//Test bench for Write//

`timescale 1ns/100ps

module test_write();
    reg        clock, rst_l, cpu_cac_read, cpu_cac_wrt;
    reg  [4:0] cpu_cac_data, cpu_cac_add;
    wire       cac_cpu_hit, cac_cpu_miss;
    wire [4:0] cac_cpu_data;
    // Cache-to-memory nets, declared explicitly so that the address and
    // data buses are 5 bits wide instead of implicit 1-bit nets
    wire       cac_mem_read, cac_mem_wrt;
    wire [4:0] cac_mem_add, cac_mem_data;

    // Instantiate Cache
    Cache Cac(
        .clock(clock),
        .rst_l(rst_l),
        .cpu_cac_add(cpu_cac_add),
        .cpu_cac_read(cpu_cac_read),
        .cpu_cac_wrt(cpu_cac_wrt),
        .cpu_cac_data(cpu_cac_data),
        .cac_cpu_hit(cac_cpu_hit),
        .cac_cpu_miss(cac_cpu_miss),
        .cac_cpu_data(cac_cpu_data),
        .cac_mem_read(cac_mem_read),
        .cac_mem_wrt(cac_mem_wrt),
        .cac_mem_add(cac_mem_add),
        .cac_mem_data(cac_mem_data)
    );

    // Instantiate Memory
    memory MemoryMain(
        .rst_l(rst_l),
        .cac_mem_add(cac_mem_add),
        .cac_mem_data(cac_mem_data),
        .cac_mem_wrt(cac_mem_wrt),
        .cac_mem_read(cac_mem_read)
    );

    // Clock
    always #2 clock <= ~clock;

    // Start
    initial begin
        clock        <= 1'b0;
        rst_l        <= 1'b1;
        cpu_cac_read <= 1'b0;
        cpu_cac_wrt  <= 1'b0;
        cpu_cac_add  <= 5'b0;
        cpu_cac_data <= 5'b0;
        #5  rst_l <= 1'b0;          // assert reset
        #10 rst_l <= 1'b1;          // release reset
        #20 cpu_cac_wrt  <= 1'b1;   // write 11111 to address 11010
            cpu_cac_add  <= 5'b11010;
            cpu_cac_data <= 5'b11111;
        #5  cpu_cac_wrt  <= 1'b0;
        $stop;
    end
endmodule

D) For Retrieving the data from memory after a Miss and sending it to Cpu

//Cache//

// Timescale
`timescale 1ns/100ps

// Module
module cache(
    // Inputs from outside
    clock,          // clock of the system
    rst_1,          // asynchronous active-low reset (rst_1 = 0 is a hard reset)
    // Inputs from Cpu
    cpu_cac_read,   // 1-bit read signal
    cpu_cac_wrt,    // 1-bit write signal
    cpu_cac_data,   // 5-bit data
    cpu_cac_add,    // 5-bit address
    // Inputs from Memory
    mem_cac_data,   // 5-bit data
    // Outputs from Cache to Cpu
    cac_cpu_hit,    // 1-bit hit flag
    cac_cpu_miss,   // 1-bit miss flag
    cac_cpu_data,   // 5-bit data
    // Outputs from Cache to Memory
    cac_mem_read,   // 1-bit read signal
    cac_mem_wrt,    // 1-bit write signal
    cac_mem_add,    // 5-bit address
    cac_mem_data    // 5-bit data
);

    // Inputs to Cache
    input        clock;
    input        rst_1;
    input        cpu_cac_read;
    input        cpu_cac_wrt;
    input  [0:4] cpu_cac_data;
    input  [0:4] cpu_cac_add;
    input  [0:4] mem_cac_data;   // originally declared 1 bit wide

    // Outputs from Cache
    output       cac_cpu_hit;
    output       cac_cpu_miss;
    output [0:4] cac_cpu_data;
    output [0:4] cac_mem_data;
    output [0:4] cac_mem_add;
    output       cac_mem_read;
    output       cac_mem_wrt;

    // Registers: Cpu side
    reg          cac_cpu_hit;
    reg          cac_cpu_miss;
    reg    [0:4] cac_cpu_data;
    // Registers: Memory side
    reg    [0:4] cac_mem_add;
    reg    [0:4] cac_mem_data;
    reg          cac_mem_read;
    reg          cac_mem_wrt;
    // State register
    reg    [2:0] state;
    // Cache: 8 entries of {valid, tag[1:0], data[4:0]}
    reg    [7:0] cache [0:7];

    parameter S0 = 0;
    parameter S1 = 1;
    parameter S2 = 2;

    // Wires
    wire   [7:0] cpu_buf;
    wire   [2:0] cpu_index;
    wire   [1:0] cpu_tag;
    wire   [1:0] cur_tag;
    wire   [4:0] mem_add;
    wire   [4:0] mem_data;
    wire   [4:0] cache_data;

    // Interface
    assign cpu_index = cpu_cac_add[0:2];   // three leftmost address bits
    assign cpu_buf   = cache[cpu_index];   // entry selected by the index
    assign cur_tag   = cpu_buf[6:5];       // tag stored in that entry
    assign cpu_tag   = cpu_cac_add[3:4];   // two rightmost address bits
    assign mem_add   = cpu_cac_add;        // full address forwarded to memory
    assign mem_data  = mem_cac_data[0:4];  // data returned by memory

    // Work
    always @(posedge clock or negedge rst_1) begin
        if (rst_1 == 0) begin
            // Initial output values
            cac_cpu_hit  <= 1'b0;
            cac_cpu_miss <= 1'b0;
            cac_cpu_data <= 5'b0;
            cac_mem_data <= 5'b0;
            cac_mem_read <= 1'b0;
            cac_mem_wrt  <= 1'b0;
            cac_mem_add  <= 5'b0;
            state        <= S0;
            // Initial cache contents
            cache[7] <= 8'b10000001;
            cache[6] <= 8'b10100010;
            cache[5] <= 8'b11000011;
            cache[4] <= 8'b11100100;
            cache[3] <= 8'b10000101;
            cache[2] <= 8'b10100110;
            cache[1] <= 8'b11000111;
            cache[0] <= 8'b11101000;
        end
        else begin
            case (state)
                // State 1: check the request for a hit
                S0: if (cpu_cac_read == 1) begin
                    if ((cpu_tag == cur_tag) & (cpu_buf[7] == 1'b1)) begin
                        // Read hit
                        $display("READ HIT");
                        cac_cpu_hit  <= 1'b1;
                        cac_cpu_miss <= 1'b0;
                        cac_cpu_data <= cpu_buf[4:0];
                    end
                    else begin
                        // Read miss: request the data from memory
                        $display("READ MISS");
                        cac_cpu_hit  <= 1'b0;
                        cac_cpu_miss <= 1'b1;
                        cac_mem_read <= 1'b1;
                        cac_mem_add  <= mem_add;
                        state        <= S1;
                    end
                end
                // State 2: wait for the memory
                S1: begin
                    cac_mem_read <= 1'b0;
                    cac_cpu_miss <= 1'b1;
                    state        <= S2;
                end
                // State 3: send the data to the Cpu and refill the entry
                S2: begin
                    cac_cpu_data <= mem_data;
                    cac_mem_read <= 1'b0;
                    cac_cpu_miss <= 1'b0;
                    cache[cpu_index] <= {1'b1, cpu_tag, mem_cac_data};
                    state        <= S0;
                end
            endcase
        end
    end
endmodule

//Memory//

// Timescale
`timescale 1ns/100ps

// Memory Module
module memory(rst_1, cac_mem_add, cac_mem_data, cac_mem_wrt, cac_mem_read, mem_cac_data);
    // Inputs
    input        rst_1;
    input        cac_mem_wrt;
    input        cac_mem_read;
    input  [4:0] cac_mem_data;
    input  [4:0] cac_mem_add;
    // Output
    output [4:0] mem_cac_data;

    // Registers
    reg [4:0] mem_data [31:0];   // 32 words of 5 bits (originally declared 11 bits wide)
    reg [4:0] mem_cac_data;
    integer   i;

    // The original block was sensitive only to negedge rst_1, so its read
    // and write branches could never execute. A level-sensitive list is one
    // possible fix.
    always @(rst_1 or cac_mem_wrt or cac_mem_read or cac_mem_add or cac_mem_data) begin
        if (rst_1 == 0) begin
            // Reset: mem_data[i] = i, e.g. mem_data[26] = 5'b11010
            for (i = 0; i < 32; i = i + 1)
                mem_data[i] = i[4:0];
        end
        else begin
            if (cac_mem_wrt == 1'b1) begin
                // Memory write
                mem_data[cac_mem_add] = cac_mem_data;
            end
            if (cac_mem_read == 1'b1) begin
                // Memory read
                mem_cac_data = mem_data[cac_mem_add];
            end
        end
    end
endmodule

//Testbench//

`timescale 1ns/100ps

module test_write();
    reg        clock, rst_1, cpu_cac_read, cpu_cac_wrt;
    reg  [4:0] cpu_cac_data, cpu_cac_add;
    // The original declared cpu_mem_wrt here; the net used below is cac_mem_wrt
    wire       cac_cpu_hit, cac_cpu_miss, cac_mem_read, cac_mem_wrt;
    wire [4:0] cac_cpu_data, cac_mem_add, cac_mem_data, mem_cac_data;

    // Instantiate Cache
    cache Cac(
        .clock(clock),
        .rst_1(rst_1),
        .cpu_cac_read(cpu_cac_read),
        .cpu_cac_wrt(cpu_cac_wrt),
        .cpu_cac_data(cpu_cac_data),
        .cpu_cac_add(cpu_cac_add),
        .cac_cpu_hit(cac_cpu_hit),
        .cac_cpu_miss(cac_cpu_miss),
        .cac_cpu_data(cac_cpu_data),
        .cac_mem_read(cac_mem_read),
        .cac_mem_wrt(cac_mem_wrt),
        .cac_mem_add(cac_mem_add),
        .cac_mem_data(cac_mem_data),
        .mem_cac_data(mem_cac_data)
    );

    // Instantiate Memory
    memory Mem(
        .rst_1(rst_1),
        .cac_mem_add(cac_mem_add),
        .cac_mem_data(cac_mem_data),
        .cac_mem_wrt(cac_mem_wrt),
        .cac_mem_read(cac_mem_read),
        .mem_cac_data(mem_cac_data)
    );

    // Clock
    always #2 clock <= ~clock;

    // Start
    initial begin
        clock        <= 1'b0;
        rst_1        <= 1'b1;
        cpu_cac_read <= 1'b0;
        cpu_cac_wrt  <= 1'b0;
        cpu_cac_add  <= 5'b0;
        cpu_cac_data <= 5'b0;
        #7  rst_1 <= 1'b0;          // assert reset
        #10 rst_1 <= 1'b1;          // release reset
        #20 cpu_cac_read <= 1'b1;   // read address 11010 (miss, then refill)
            cpu_cac_add  <= 5'b11010;
        #10 cpu_cac_read <= 1'b0;
        $stop;
    end
endmodule
