31
331 W10.1 Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane Irwin’s PSU CSE331 slides]

331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

  • View
    221

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.1 Spring 2005

14:332:331Computer Architecture and Assembly Language

Spring 2005

Week 10Building a Multi-Cycle Datapath

[Adapted from Dave Patterson’s UCB CS152 slides and

Mary Jane Irwin’s PSU CSE331 slides]

Page 2: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.2 Spring 2005

Head’s Up This week’s material

Multicycle MIPS datapath implementation- Reading assignment – PH 5.5 and C.3

Page 3: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.3 Spring 2005

Review: Single Cycle Data and Control Path

ReadAddress

Instr[31-0]

InstructionMemory

Add

PC

4

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

ovf

zero

RegWrite

DataMemory

Address

Write Data

Read Data

MemWrite

MemRead

SignExtend16 32

MemtoReg

ALUSrc

Shiftleft 2

Add

PCSrc

RegDst

ALUcontrol

1

1

1

00

0

0

1

ALUOp

Instr[5-0]

Instr[15-0]

Instr[25-21]

Instr[20-16]

Instr[15 -11]

ControlInstr[31-26]

Branch

Shiftleft 2

0

1

Jump

28Instr[25-0]

26PC+4[31-28]

32

Page 4: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.4 Spring 2005

Disadvantages of the Single Cycle Datapath

Uses the clock cycle inefficiently – the clock cycle must be timed to accommodate the slowest instruction

especially problematic for more complex instructions like floating point multiply

Is wasteful of area since some functional units must be duplicated since they can not be “shared” during an instruction execution

e.g., need separate adders to do PC update and branch target address calculations, as well as an ALU to do R-type arithmetic/logic operations and data memory address calculations

Page 5: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.5 Spring 2005

Multicycle Implementation Overview Each step in the execution takes 1 clock cycle

An instruction takes more than 1 clock cycle to complete

Not every instruction takes the same number of clock cycles to complete

Multicycle implementations allow functional units to be used more than once per instruction

as long as they are used on different clock cycles, as a result

- only need one memory

- need only one ALU/adder

faster clock rates different instructions to take a different number of clock

cycles

Page 6: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.6 Spring 2005

The Multicycle Datapath – A High Level View

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

AL

Uo

ut

Registers have to be added after every major functional unit to hold the output value until it is used in a subsequent clock cycle

Page 7: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.7 Spring 2005

Clocking the Multicycle Datapath

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

AL

Uo

ut

System Clock

MemWrite RegWrite

clock cycle

Page 8: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.8 Spring 2005

Break up the instructions into steps where each step takes a cycle while trying to balance the amount of work to be done in each step restrict each cycle to use only one major functional unit

At the end of a cycle Store values needed in a later cycle by the current

instruction in a state element (internal register) not visible to the programmer

IR – Instruction Register

MDR – Memory Data Register

A and B – register file read data registers

ALUout – ALU output register All (except IR) hold data only between a pair of adjacent

clock cycles (so don’t need a write control signal) Data used by subsequent instructions are stored in

programmer visible state elements (i.e., register file, PC, or memory)

Multicycle Approach

Page 9: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.9 Spring 2005

The Complete Multicycle Data with Control

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

AL

Uo

ut

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorD

PCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[3

1-2

6]

32

28

Page 10: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.10 Spring 2005

Reading/writing to any of the internal registers or the PC occurs (quickly) at

the end of a clock cycle reading/writing to the register file takes ~50% of a clock

cycle since it has additional control and access overhead (reading can be done in parallel with decode)

Have to add multiplexors in front of several of the functional unit inputs because the functional units are shared by different instruction cycles

All operations occurring in one step occur in parallel within the same clock cycle This limits us to one ALU operation, one memory access,

and one register file access per step (per clock cycle)

Multicycle Approach, con’t

Page 11: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.11 Spring 2005

Instruction Fetch

Instruction Decode and Register Fetch

R-type Instruction Execution, Memory Read/Write Address Computation, Branch Completion, or Jump Completion

Memory Read Access, Memory Write Completion or R-type Instruction Completion

Memory Read Completion (Write Back)

INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!

Five Instruction Steps

Page 12: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.12 Spring 2005

Use PC to get instruction from the memory and put it in the Instruction Register

Increment the PC by 4 and put the result back in the PC

Can be described succinctly using RTL "Register-Transfer Language“

IR = Memory[PC]; PC = PC + 4;

Can we figure out the values of the control signals?

What is the advantage of updating the PC now?

Step 1: Instruction Fetch

Page 13: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.13 Spring 2005

Fetch Control Signals Settings

Start

Instr Fetch

Page 14: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.14 Spring 2005

Don’t know what the instruction is yet, so can only Read registers rs and rt in case we need them Compute the branch address in case the instruction is a

branch

RTL:

A = Reg[IR[25-21]];B = Reg[IR[20-16]];ALUOut = PC +(sign-extend(IR[15-0])<< 2);

Note we aren't setting any control lines based on the instruction (since we are busy "decoding" it in our control logic)

Step 2: Instruction Decode and Register Fetch

Page 15: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.15 Spring 2005

Datapath Activity During Instruction Decode

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

AL

Uo

ut

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorD

PCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[3

1-2

6]

32

28

Page 16: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.16 Spring 2005

Decode Control Signals Settings

Start

Instr Fetch Decode

Unless otherwise assigned

PCWrite,IRWrite, MemWrite,RegWrite=0 others=X

IorD=0MemRead;IRWrite

ALUSrcA=0ALUsrcB=01

PCSource,ALUOp=00PCWrite

Page 17: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.17 Spring 2005

ALU is performing one of four functions, based on instruction type

Memory reference (lw and sw):

ALUOut = A + sign-extend(IR[15-0]);

R-type:

ALUOut = A op B;

Branch:

if (A==B) PC = ALUOut;

Jump:

PC = PC[31-28] || (IR[25-0] << 2);

Step 3 (instruction dependent)

Page 18: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.18 Spring 2005

Datapath Activity During Instruction Execute

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

AL

Uo

ut

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorD

PCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[3

1-2

6]

32

28

Page 19: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.19 Spring 2005

Execute Control Signals Settings

Start

Instr Fetch Decode

Execute

(Op = R-

type)

(Op =

beq)

(Op = lw or

sw) (Op = j)

Unless otherwise assigned

PCWrite,IRWrite, MemWrite,RegWrite=0 others=X

ALUSrcA=0ALUSrcB=11ALUOp=00

PCWriteCond=0

IorD=0MemRead;IRWrite

ALUSrcA=0ALUsrcB=01

PCSource,ALUOp=00PCWrite

Page 20: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.20 Spring 2005

Memory reference:

MDR = Memory[ALUOut]; -- lwor

Memory[ALUOut] = B; -- sw

R-type instruction completion

Reg[IR[15-11]] = ALUOut;

Remember, the register write actually takes place at the end of the cycle on the clock edge

Step 4 (instruction dependent)

Page 21: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.21 Spring 2005

Datapath Activity During Memory Access

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

AL

Uo

ut

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorD

PCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[3

1-2

6]

32

28

Page 22: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.22 Spring 2005

Memory Access Control Signals Settings

Start

Instr Fetch Decode

Memory Access

Execute

(Op = R-

type)

(Op =

beq)

(Op = lw or

sw) (Op = j)

(Op = lw)(Op = sw)

Unless otherwise assigned

PCWrite,IRWrite, MemWrite,RegWrite=0 others=X

IorD=0MemRead;IRWrite

ALUSrcA=0ALUsrcB=01

PCSource,ALUOp=00PCWrite

ALUSrcA=0ALUSrcB=11ALUOp=00

PCWriteCond=0

ALUSrcA=1ALUSrcB=10ALUOp=00

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=10

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=01

PCSource=01PCWriteCond

PCSource=10PCWrite

Page 23: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.23 Spring 2005

All we have left is the write back into the register file the data just read from memory for lw instruction

Reg[IR[20-16]]= MDR;

What about all the other instructions?

Step 5: Memory Read Completion (Write Back)

Page 24: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.24 Spring 2005

Datapath Activity During Write Back

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

AL

Uo

ut

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorD

PCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[3

1-2

6]

32

28

Page 25: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.25 Spring 2005

Write Back Control Signals Settings

Start

Instr Fetch Decode

Write Back

Memory Access

Execute

(Op = R-

type)

(Op =

beq)

(Op = lw or

sw) (Op = j)

(Op = lw)(Op = sw)

Unless otherwise assigned

PCWrite,IRWrite, MemWrite,RegWrite=0 others=X

IorD=0MemRead;IRWrite

ALUSrcA=0ALUsrcB=01

PCSource,ALUOp=00PCWrite

ALUSrcA=0ALUSrcB=11ALUOp=00

PCWriteCond=0

ALUSrcA=1ALUSrcB=10ALUOp=00

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=10

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=01

PCSource=01PCWriteCond

PCSource=10PCWrite

RegDst=1RegWriteMemtoReg=0

PCWriteCond=0

MemWriteIorD=1

PCWriteCond=0

MemReadIorD=1

PCWriteCond=0

Page 26: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.26 Spring 2005

RTL Summary

Step R-type Mem Ref Branch Jump

Instr fetch

IR = Memory[PC]; PC = PC + 4;

Decode A = Reg[IR[25-21]];B = Reg[IR[20-16]];

ALUOut = PC +(sign-extend(IR[15-0])<< 2);

Execute

ALUOut = A op B;

ALUOut = A + sign-extend (IR[15-0]);

if (A==B) PC =

ALUOut;

PC = PC[31-28] ||(IR[25-0]

<< 2);

Memory access

Reg[IR[15-11]] = ALUOut;

MDR = Memory[ALUOut];

orMemory[ALUOut] = B;

Write-back

Reg[IR[20-16]] = MDR;

Page 27: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.27 Spring 2005

How many cycles will it take to execute this code?

lw $t2, 0($t3)lw $t3, 4($t3)beq $t2, $t3, Label #assume

notadd $t5, $t2, $t3sw $t5, 8($t3)

Label: ...

What is going on during the 8th cycle of execution?

In what cycle does the actual addition of $t2 and $t3 takes place?

In what cycle is the branch target address calculated?

Simple Questions

Page 28: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.28 Spring 2005

Multicycle datapath control signals are not determined solely by the bits in the instruction e.g., op code bits tell what operation the ALU should be

doing, but not what instruction cycle is to be done next

We’ll use a finite state machine for control a set of states (current state stored in State Register) next state function (determined

by current state and the input) output function (determined by

current state and the input)

We’ll use a Moore machine (so control signals based only on current state)

Multicycle Control

Combinationalcontrol logic

State RegInst

Opcode

Datapathcontrolpoints

Next State

. . .. . .

. . .

Page 29: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.29 Spring 2005

Finite State Machine Implementation

Combinationalcontrol logic

State RegInst[31-26]

NextState

Inputs

Out

puts

Op0

Op1

Op2

Op3

Op4

Op5

PCWritePCWriteCondIorDMemReadMemWriteIRWriteMemtoRegPCSourceALUOpALUSourceBALUSourceARegWriteRegDst

System Clock

Page 30: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.30 Spring 2005

Datapath Control Outputs Truth Table

Outputs Input Values (Current State[3-0])0000 0001 0010 0011 0100 0101 0110 0111 1000 1001

PCWrite 1 0 0 0 0 0 0 0 0 1

PCWriteCond

X 0 0 0 0 0 0 0 1 X

IorD 0 X X 1 X 1 X X X X

MemRead 1 X X 1 X X X X X X

MemWrite 0 0 0 0 0 1 0 0 0 0

IRWrite 1 0 0 0 0 0 0 0 0 0

MemtoReg X X X X 1 X X 0 X X

PCSource 00 XX XX XX XX XX XX XX 01 10

ALUOp 00 00 00 XX XX XX 10 XX 01 XX

ALUSrcB 01 11 10 XX XX XX 00 XX 00 XX

ALUSrcA 0 0 1 X X X 1 X 1 X

RegWrite 0 0 0 0 1 0 0 1 0 0

RegDst X X X X 0 X X 1 X X

Page 31: 331 W10.1Spring 2005 14:332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s

331 W10.31 Spring 2005

Next State Truth Table

Current State [3-

0]

Inst[31-26] (Op[5-0])

000000 (R-type)

000010 (jmp)

000100 (beq)

100011 (lw)

101011 (sw)

Any other

0000 0001 0001 0001 0001 0001 0001

0001 0110 1001 1000 0010 0010 illegal

0010 XXXX XXXX XXXX 0011 0101 illegal

0011 XXXX XXXX XXXX 0100 XXXX illegal

0100 XXXX XXXX XXXX 0000 XXXX illegal

0101 XXXX XXXX XXXX XXXX 0000 illegal

0110 0111 XXXX XXXX XXXX XXXX illegal

0111 0000 XXXX XXXX XXXX XXXX illegal

1000 XXXX XXXX 0000 XXXX XXXX illegal

1001 XXXX 0000 XXXX XXXX XXXX illegal