CMPT 250 Computer Architecture

Instructor: Yuzhuang Hu (yhu1@cs.sfu.ca)


The Shifter

3 clock cycles will be needed if using a bidirectional shift register with parallel load.

A clock pulse loads the output of Bus B into the shift register.

Another clock pulse performs the shift.

Another clock pulse transfers the result to the destination register.

A Faster Approach: Combinational Shifters

Input IR: right shift, IL: left shift. Output R: right shift, L: left shift.

[Figure: 4-bit combinational shifter. Data inputs B3 to B0, serial inputs IR and IL, 2-bit select S driving four multiplexers (inputs 0, 1, 2), outputs H3 to H0, serial outputs L and R.]
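The behaviour of the combinational shifter can be sketched in Python. This is only an illustration under stated assumptions: the select encoding (0 = pass, 1 = shift right, 2 = shift left) is taken from the H select codes used later in these slides, the serial outputs are assumed to be B0 (for R) and B3 (for L), and the function name `shift4` is hypothetical.

```python
def shift4(b, s, ir=0, il=0):
    """Model a 4-bit combinational shifter (illustrative sketch).

    b  : 4-bit input B3..B0 as an int
    s  : 2-bit select (assumed: 0 = pass, 1 = shift right, 2 = shift left)
    ir : serial input that enters H3 on a right shift
    il : serial input that enters H0 on a left shift
    Returns (h, serial_out_r, serial_out_l).
    """
    b &= 0xF
    serial_r = b & 1          # assumed: B0 is the serial output for right shifts
    serial_l = (b >> 3) & 1   # assumed: B3 is the serial output for left shifts
    if s == 0:
        h = b                         # pass B through unchanged
    elif s == 1:
        h = (ir << 3) | (b >> 1)      # shift right, IR fills H3
    elif s == 2:
        h = ((b << 1) & 0xF) | il     # shift left, IL fills H0
    else:
        raise ValueError("select value 3 is unused in this sketch")
    return h, serial_r, serial_l
```

Because the result depends only on the current inputs, the shift completes within the same clock cycle instead of the three cycles needed by the shift-register approach.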

4-Bit Barrel Shifter

Depending on S, the barrel shifter can shift or rotate the input data by several bits.

[Figure: 4-bit barrel shifter. Data inputs D3 to D0, select inputs S1 and S0 driving four multiplexers (inputs 0 to 3), outputs Y3 to Y0.]

Function Table for 4-Bit Barrel Shifter

Select    Output            Operation
S1  S0    Y3  Y2  Y1  Y0
0   0     D3  D2  D1  D0    No rotation
0   1     D2  D1  D0  D3    Rotate one position
1   0     D1  D0  D3  D2    Rotate two positions
1   1     D0  D3  D2  D1    Rotate three positions
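The function table corresponds to a left rotation by S positions. A minimal Python sketch follows; the names `barrel_rotate4` and `barrel_mux4` are hypothetical, and the second function mirrors the multiplexer view, where each output Yi selects input D[(i - S) mod 4].

```python
def barrel_rotate4(d, s):
    """Rotate the 4-bit value d left by s positions (s = 0..3)."""
    d &= 0xF
    s &= 0x3
    return ((d << s) | (d >> (4 - s))) & 0xF


def barrel_mux4(d_bits, s):
    """Multiplexer view: d_bits is [D0, D1, D2, D3]; returns [Y0, Y1, Y2, Y3].

    Output Yi is driven by input D[(i - s) mod 4], as in the function table.
    """
    return [d_bits[(i - s) % 4] for i in range(4)]


# Symbolic check against the row S1 S0 = 0 1 (Y3 Y2 Y1 Y0 = D2 D1 D0 D3)
y = barrel_mux4(["D0", "D1", "D2", "D3"], 1)
print(list(reversed(y)))  # prints ['D2', 'D1', 'D0', 'D3']
```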

Datapath Representation

Reduce the apparent complexity of the datapath with a hierarchical structure.

The registers, and the multiplexer, decoder, and enable hardware for accessing them are encapsulated into a register file.

The ALU, shifter, Mux F and status bits are encapsulated into a function unit.

The details of the register file and the function unit are now at a lower design hierarchy.

[Figure: hierarchical datapath block diagram. A 2^m x n register file (D data, Write, D address, A address, B address, A data, B data), MUX B (MB select, Constant in), function unit (inputs A and B, FS, status bits V, C, N, Z, output F), MUX D (MD select, Data in), Bus A (Address out), Bus B (Data out).]

[Figure: detailed datapath block diagram. Register file with R0 to R3, a decoder with load enable (Write, D address, destination select), and multiplexers for A select and B select; MUX B (MB select, Constant in); function unit containing the arithmetic/logic unit (G select, S2:0 || Cin), the shifter (H select, IR, IL), MUX F (MF select) and zero detect; status bits V, C, N, Z; MUX D (MD select, Data in); Bus A (Address out), Bus B (Data out), Bus D.]

Register File

A set of registers having common micro-operations performed on them may be organized into a register file.

The typical register file is a special type of fast memory that permits one or more words to be read or written, all simultaneously.

G Select, H Select, and MF Select Codes Defined in Terms of FS

FS(3:0)  MF Select  G Select(3:0)  H Select(1:0)  Microoperation
0000     0          0000           XX             F ← A
0001     0          0001           XX             F ← A + 1
0010     0          0010           XX             F ← A + B
0011     0          0011           XX             F ← A + B + 1
0100     0          0100           XX             F ← A + B̄
0101     0          0101           XX             F ← A + B̄ + 1
0110     0          0110           XX             F ← A − 1
0111     0          0111           XX             F ← A
1000     0          1X00           XX             F ← A ∧ B
1001     0          1X01           XX             F ← A ∨ B
1010     0          1X10           XX             F ← A ⊕ B
1011     0          1X11           XX             F ← Ā
1100     1          XXXX           00             F ← B
1101     1          XXXX           01             F ← sr B
1110     1          XXXX           10             F ← sl B
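One way to derive the three select codes from FS, consistent with the table and making use of its don't-care (X) entries, is sketched below in Python. The function name `decode_fs` is hypothetical, and the choice to forward FS unchanged as G select and its two low bits as H select is an assumption that the X entries permit, not something the slides state.

```python
def decode_fs(fs):
    """Split a 4-bit FS code into (MF select, G select, H select).

    Illustrative sketch: the don't-care entries of the table are resolved
    by simply forwarding bits of FS.
    """
    fs &= 0xF
    mf = (fs >> 3) & (fs >> 2) & 1   # 1 only when FS = 11xx (shifter result selected)
    g = fs                           # ALU select; ignored when mf = 1
    h = fs & 0b11                    # shifter select; ignored when mf = 0
    return mf, g, h


for fs in (0b0101, 0b1010, 0b1101):
    print(f"{fs:04b}", decode_fs(fs))
```

With this choice no extra decoding logic beyond a single AND gate for MF select is needed.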

The Control Word

There are 16 binary control inputs to the datapath. Their combined values specify a control word.

Recall that:

DA: destination register address.

AA and BA: the addresses of the A and B operands.

MB and MD: select inputs for MUX B and MUX D respectively.

FS: function select for the function unit.

RW: write to the register file.

Control word

Field  DA     AA     BA   MB  FS   MD  RW
Bits   15:13  12:10  9:7  6   5:2  1   0
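The field layout can be illustrated with a small Python sketch that packs and unpacks a 16-bit control word. The helper names `pack` and `unpack` are hypothetical, and treating omitted (don't-care) fields as 0 is an assumption made only for this example.

```python
# (name, lowest bit position, width) for each field of the 16-bit control word
FIELDS = [("DA", 13, 3), ("AA", 10, 3), ("BA", 7, 3),
          ("MB", 6, 1), ("FS", 2, 4), ("MD", 1, 1), ("RW", 0, 1)]


def pack(**fields):
    """Build a control word; omitted (don't-care) fields default to 0."""
    word = 0
    for name, low, width in FIELDS:
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        word |= value << low
    return word


def unpack(word):
    """Return a dict mapping each field name to its value."""
    return {name: (word >> low) & ((1 << width) - 1)
            for name, low, width in FIELDS}


# Control word for R1 <- R2 - R3 (FS = 0101, result from the function unit, write enabled)
word = pack(DA=1, AA=2, BA=3, MB=0, FS=0b0101, MD=0, RW=1)
print(f"{word:016b}")  # prints 0010100110010101
```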

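Block Diagram

[Figure: datapath block diagram annotated with control word bits. DA = bits 15:13, AA = bits 12:10, BA = bits 9:7, MB = bit 6, FS = bits 5:2, MD = bit 1, RW = bit 0. Components: 8 x n register file (D data, Write, D address, A address, B address, A data, B data), MUX B with Constant in, function unit with status bits V, C, N, Z, MUX D with Data in, Bus A (Address out), Bus B (Data out), Bus D.]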

Examples of Microoperations for the Datapath: Symbolic Representation

Micro-operation   DA  AA  BA  MB        FS              MD        RW
R1 ← R2 − R3      R1  R2  R3  Register  F = A + B̄ + 1   Function  Write
R4 ← sl R6        R4  —   R6  Register  F = sl B        Function  Write
R7 ← R7 + 1       R7  R7  —   Register  F = A + 1       Function  Write
R1 ← R0 + 2       R1  R0  —   Constant  F = A + B       Function  Write
Data out ← R3     —   —   R3  Register  —               —         No Write
R4 ← Data in      R4  —   —   —         —               Data in   Write
R5 ← 0            R5  R0  R0  Register  F = A ⊕ B       Function  Write

Examples of Microoperations for the Datapath: Binary Representation

Micro-operation   DA   AA   BA   MB  FS    MD  RW
R1 ← R2 − R3      001  010  011  0   0101  0   1
R4 ← sl R6        100  XXX  110  0   1110  0   1
R7 ← R7 + 1       111  111  XXX  0   0001  0   1
R1 ← R0 + 2       001  000  XXX  1   0010  0   1
Data out ← R3     XXX  XXX  011  0   XXXX  X   0
R4 ← Data in      100  XXX  XXX  X   XXXX  1   1
R5 ← 0            101  000  000  0   1010  0   1

Simulation of the Microoperation Sequence

[Figure: simulation waveforms for Clock, DA, AA, BA, Constant_in, MB, Address_out, Data_out, FS, Status_bits, Data_in, MD, RW, and reg0 to reg7. Register values shown over time: reg0 = 0; reg1 = 1, 255, 2; reg2 = 2; reg3 = 3; reg4 = 4, 12, 18; reg5 = 5, 0; reg6 = 6; reg7 = 7, 8.]
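The microoperation sequence can be replayed with a small Python model of the datapath. This is a sketch under several assumptions read off the waveform values rather than stated in the slides: registers are 8 bits wide, register Ri initially holds i, Constant_in is 2 and Data_in is 18. Don't-care (X) fields are set to 0, and only the FS codes used by the seven example microoperations are modelled. The names `function_unit`, `step` and `run` are hypothetical.

```python
MASK = 0xFF  # assumed 8-bit registers (the waveform shows 2 - 3 = 255)


def function_unit(fs, a, b):
    """Subset of the FS table needed by the seven example microoperations."""
    ops = {
        0b0000: a,                    # F = A
        0b0001: a + 1,                # F = A + 1
        0b0010: a + b,                # F = A + B
        0b0101: a + (~b & MASK) + 1,  # F = A + (complement of B) + 1, i.e. A - B
        0b1010: a ^ b,                # F = A xor B
        0b1110: b << 1,               # F = sl B
    }
    return ops[fs] & MASK


def step(regs, cw, constant_in=0, data_in=0):
    """Apply one control word (DA, AA, BA, MB, FS, MD, RW) for one clock cycle."""
    da, aa, ba, mb, fs, md, rw = cw
    bus_a = regs[aa]
    bus_b = constant_in if mb else regs[ba]   # MUX B
    f = function_unit(fs, bus_a, bus_b)
    bus_d = data_in if md else f              # MUX D
    if rw:
        regs[da] = bus_d
    return bus_a, bus_b                       # Address out, Data out


CONTROL_WORDS = [
    (1, 2, 3, 0, 0b0101, 0, 1),  # R1 <- R2 - R3
    (4, 0, 6, 0, 0b1110, 0, 1),  # R4 <- sl R6
    (7, 7, 0, 0, 0b0001, 0, 1),  # R7 <- R7 + 1
    (1, 0, 0, 1, 0b0010, 0, 1),  # R1 <- R0 + 2
    (0, 0, 3, 0, 0b0000, 0, 0),  # Data out <- R3 (no write)
    (4, 0, 0, 0, 0b0000, 1, 1),  # R4 <- Data in
    (5, 0, 0, 0, 0b1010, 0, 1),  # R5 <- 0 (R0 xor R0)
]


def run():
    regs = list(range(8))        # assumed initial contents: Ri = i
    for cw in CONTROL_WORDS:
        step(regs, cw, constant_in=2, data_in=18)
    return regs


print(run())  # prints [0, 2, 2, 3, 18, 0, 6, 8]
```

Under these assumptions the final register contents agree with the last values visible in the waveform for reg1, reg4, reg5 and reg7.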

A Simple Computer Architecture

Instruction Set Architecture: defines the boundary between hardware and software.

An instruction is a collection of bits that instructs the computer to perform a specific operation.

We call the collection of instructions for a computer its instruction set and a thorough description of the instruction set its instruction set architecture (ISA).

Storage Resources

The following diagram depicts the computer structure as viewed by a user programming it in a language that directly specifies the instructions to be executed.

[Figure: storage resources. Program counter (PC); instruction memory, 2^15 x 16; register file, 8 x 16; data memory, 2^15 x 16.]

Three Instruction Formats

An instruction consists of an operation code, several fields about the operands, and possibly a field about the location to store the result.

Bits                 15:9    8:6                        5:3                     2:0
(a) Register         Opcode  Destination register (DR)  Source register A (SA)  Source register B (SB)
(b) Immediate        Opcode  Destination register (DR)  Source register A (SA)  Operand (OP)
(c) Jump and Branch  Opcode  Address (AD) (Left)        Source register A (SA)  Address (AD) (Right)

Register Instructions

SA: source register A, SB: source register B, DR: destination register.

Consider the instruction R1 <- R2 + R3. Here SA = R2, SB = R3, DR = R1.

Bits          15:9    8:6                        5:3                     2:0
(a) Register  Opcode  Destination register (DR)  Source register A (SA)  Source register B (SB)

Immediate Instructions

SA: source register A, DR: destination register, OP: an immediate number.

Consider the instruction R0 <- R1 + 3. Here SA = R1, OP = 3, DR = R0.

Bits           15:9    8:6                        5:3                     2:0
(b) Immediate  Opcode  Destination register (DR)  Source register A (SA)  Operand (OP)

Jump and Branch Instructions

SA: source register A. AD (Left) || AD (Right): a number in signed 2's complement representation.

Consider the instruction 1100000 101 110 100. Here SA = R6, AD = -20. It is equivalent to "If R6 = 0, PC <- PC - 20."

Bits                 15:9    8:6                  5:3                     2:0
(c) Jump and Branch  Opcode  Address (AD) (Left)  Source register A (SA)  Address (AD) (Right)
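The field extraction and the sign extension of AD can be checked with a short Python sketch. The function names `decode` and `branch_offset` and the generic field names `left` and `right` are hypothetical; the bit positions follow the three formats.

```python
def decode(word):
    """Split a 16-bit instruction into its raw fields."""
    return {
        "opcode": (word >> 9) & 0x7F,  # bits 15:9
        "left":   (word >> 6) & 0x7,   # bits 8:6 (DR, or AD left)
        "sa":     (word >> 3) & 0x7,   # bits 5:3 (source register A)
        "right":  word & 0x7,          # bits 2:0 (SB, OP, or AD right)
    }


def branch_offset(word):
    """Return AD = AD left || AD right as a signed 6-bit 2's complement value."""
    fields = decode(word)
    ad = (fields["left"] << 3) | fields["right"]
    return ad - 64 if ad & 0x20 else ad  # bit 5 is the sign bit


word = 0b1100000_101_110_100   # the branch example above
print(decode(word)["opcode"], decode(word)["sa"], branch_offset(word))
# prints 96 6 -20
```

The unsigned value of the concatenated field is 101100 = 44, and 44 - 64 = -20, which matches the offset in the example.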

Instruction Specifications for the Simple Computer

TABLE 9-8 Instruction Specifications for the Simple Computer

Instruction         Opcode   Mnemonic  Format       Description                          Status Bits
Move A              0000000  MOVA      RD, RA       R[DR] ← R[SA]*                       N, Z
Increment           0000001  INC       RD, RA       R[DR] ← R[SA] + 1*                   N, Z
Add                 0000010  ADD       RD, RA, RB   R[DR] ← R[SA] + R[SB]*               N, Z
Subtract            0000101  SUB       RD, RA, RB   R[DR] ← R[SA] − R[SB]*               N, Z
Decrement           0000110  DEC       RD, RA       R[DR] ← R[SA] − 1*                   N, Z
AND                 0001000  AND       RD, RA, RB   R[DR] ← R[SA] ∧ R[SB]*               N, Z
OR                  0001001  OR        RD, RA, RB   R[DR] ← R[SA] ∨ R[SB]*               N, Z
Exclusive OR        0001010  XOR       RD, RA, RB   R[DR] ← R[SA] ⊕ R[SB]*               N, Z
NOT                 0001011  NOT       RD, RA       R[DR] ← NOT R[SA]*                   N, Z
Move B              0001100  MOVB      RD, RB       R[DR] ← R[SB]*
Shift Right         0001101  SHR       RD, RB       R[DR] ← sr R[SB]*
Shift Left          0001110  SHL       RD, RB       R[DR] ← sl R[SB]*
Load Immediate      1001100  LDI       RD, OP       R[DR] ← zf OP*
Add Immediate       1000010  ADI       RD, RA, OP   R[DR] ← R[SA] + zf OP*               N, Z
Load                0010000  LD        RD, RA       R[DR] ← M[SA]*
Store               0100000  ST        RA, RB       M[SA] ← R[SB]*
Branch on Zero      1100000  BRZ       RA, AD       if (R[SA] = 0) PC ← PC + se AD,      N, Z
                                                    if (R[SA] ≠ 0) PC ← PC + 1
Branch on Negative  1100001  BRN       RA, AD       if (R[SA] < 0) PC ← PC + se AD,      N, Z
                                                    if (R[SA] ≥ 0) PC ← PC + 1
Jump                1110000  JMP       RA           PC ← R[SA]

* For all of these instructions, PC ← PC + 1 is also executed to prepare for the next cycle.

Memory Representation of Instructions and Data

Decimal Address  Memory Contents      Decimal Opcode       Other Fields      Operation
25               0000101 001 010 011  5 (Subtract)         DR:1, SA:2, SB:3  R1 ← R2 − R3
35               0100000 000 100 101  32 (Store)           SA:4, SB:5        M[R4] ← R5
45               1000010 010 111 011  66 (Add Immediate)   DR:2, SA:7, OP:3  R2 ← R7 + 3
55               1100000 101 110 100  96 (Branch on Zero)  AD:44, SA:6       If R6 = 0, PC ← PC − 20
70               0000000011000000     Data = 192. After execution of instruction in 35, Data = 80.

Block Diagram for a Single-Cycle Computer

[Figure: block diagram of the single-cycle computer. CONTROL: PC, branch control (inputs PL, JB, BC and status bits V, C, N, Z), instruction memory (Address, Instruction), instruction decoder (outputs DA, AA, BA, MB, FS, MD, RW, MW, PL, JB, BC), Extend block fed by IR(8:6) || IR(2:0), Jump Address. DATAPATH: register file (D, A, B), Zero fill of IR(2:0) to Constant in, MUX B, function unit (A, B, F, status bits V, C, N, Z), MUX D, Bus A (Address out), Bus B (Data out), Bus D, and data memory (Data in, Address, Data out, MW).]

Control Unit of the Single Cycle Simple Computer

We have described the design of its datapath.

The block diagram for this computer has a hardwired control unit that fetches and executes an instruction in a single clock cycle.

We do not write to the instruction memory, making it appear in this model to be a combinational rather than a sequential component.

The Program Counter (PC)

The PC provides the instruction address to the instruction memory.

The PC is updated in each clock cycle. The behaviour of the PC is determined by the opcode, N, and Z.

PC Operation                        PL  JB  BC
Count Up                            0   X   X
Jump                                1   1   X
Branch on Negative (else Count Up)  1   0   1
Branch on Zero (else Count Up)      1   0   0
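The PC behaviour in the table can be written as a small Python function. This is a sketch, not the hardware design: the name `next_pc` and its parameter list are hypothetical, the jump target is taken to be R[SA] on Bus A and the branch offset to be the sign-extended AD field (as in the instruction specifications), and wrap-around at the PC width is not modelled.

```python
def next_pc(pc, pl, jb, bc, n, z, bus_a, offset):
    """Return the next PC value for one clock cycle.

    pl, jb, bc : control bits from the instruction decoder
    n, z       : Negative and Zero status bits
    bus_a      : R[SA], used as the jump target (assumption from PC <- R[SA])
    offset     : sign-extended AD, used for taken branches
    """
    if pl == 0:
        return pc + 1            # Count Up
    if jb == 1:
        return bus_a             # Jump
    taken = n if bc == 1 else z  # BC = 1: branch on negative, BC = 0: branch on zero
    return pc + offset if taken else pc + 1


# Branch on Zero at address 55 with offset -20 and Z = 1
print(next_pc(55, pl=1, jb=0, bc=0, n=0, z=1, bus_a=0, offset=-20))  # prints 35
```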

THANKS!