113
Skup instrukcija mikropocesora MIPS

Skup instrukcija mikropocesora MIPS

  • Upload
    bing

  • View
    49

  • Download
    0

Embed Size (px)

DESCRIPTION

Skup instrukcija mikropocesora MIPS. MIPS register convention. Register 1, called $at, is reserved for the assembler, and registers 26-27, called $k0-$k1, are reserved for the operationg system. MIPS operands. A Small Set of Instructions. - PowerPoint PPT Presentation

Citation preview

Page 1: Skup instrukcija  mikropocesora MIPS

Skup instrukcija

mikropocesora MIPS

Page 2: Skup instrukcija  mikropocesora MIPS

MIPS register convention

Register 1, called $at, is reserved for the assembler, and registers 26-27, called $k0-$k1, are reserved for the operationg system.

Name Register number

Usage Preserved on call?

$zero 0 the constant value 0 n.a.

$v0-$v1

2-3 values for results and expression evalution

no

$a0-$a3

4-7 arguments yes

$t0-$t7

8-15 temporaries no

$s0-$s7

16-23 saved yes

$t8-$t9

24-15 more tempporaries no

$gp 28 global pointer yes

$sp 29 stack pointer yes

$fp 30 frame pointer yes

$ra 31 return address yes

Page 3: Skup instrukcija  mikropocesora MIPS

MIPS operands

Name Example Comments

32 registers

$s0-$s7, $t0-$t9, $gp, $fp, $zero, $sp, $ra, $at, Hi, Lo

Fast locations for data, In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assumbler to handle large constants. Hi and Lo contain the results of multiply and divide.

230 memory words

Memory [0], Memory [4], ..., Memory [4294967292]

Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.

Page 4: Skup instrukcija  mikropocesora MIPS

Fig. 13.1 MicroMIPS instruction formats and naming of the various fields.

5 bits 5 bits

31 25 20 15 0

Opcode Source 1 or base

Source 2 or dest’n

op rs rt

R 6 bits 5 bits

rd

5 bits

sh

6 bits

10 5 fn

jta Jump target address, 26 bits

imm Operand / Offset, 16 bits

Destination Unused Opcode ext I

J

inst Instruction, 32 bits

Seven R-format ALU instructions (add, sub, slt, and, or, xor, nor)Six I-format ALU instructions (lui, addi, slti, andi, ori, xori)Two I-format memory access instructions (lw, sw)Three I-format conditional branch instructions (bltz, beq, bne)Four unconditional jump instructions (j, jr, jal, syscall)

We will refer to this diagram later

A Small Set of Instructions

Page 5: Skup instrukcija  mikropocesora MIPS

Instruction UsageLoad upper immediate lui rt,imm

Add  add rd,rs,rt

Subtract sub rd,rs,rt

Set less than slt rd,rs,rt

Add immediate  addi rt,rs,imm

Set less than immediate slti rd,rs,imm

AND and rd,rs,rt

OR or rd,rs,rt

XOR xor rd,rs,rt

NOR nor rd,rs,rt

AND immediate andi rt,rs,imm

OR immediate ori rt,rs,imm

XOR immediate xori rt,rs,imm

Load word lw rt,imm(rs)

Store word sw rt,imm(rs)

Jump  j L

Jump register jr rs

Branch less than 0 bltz rs,L

Branch equal beq rs,rt,L

Branch not equal  bne rs,rt,L

Jump and link jal L

System call  syscall

Copy

Control transfer

Logic

Arithmetic

Memory access

op15

0008

100000

1213143543

2014530

fn

323442

36373839

8

12Table

The MicroMIPS Instruction Set

Page 6: Skup instrukcija  mikropocesora MIPS

13.2 The Instruction Execution Unit

Fig. 13.2 Abstract view of the instruction execution unit for MicroMIPS. For naming of instruction fields, see Fig. 13.1.

ALU

Data cache

Instr cache

Next addr

Control

Reg file

op

jta

fn

inst

imm

rs,rt,rd (rs)

(rt)

Address

Data

PC

5 bits 5 bits

31 25 20 15 0

Opcode Source 1 or base

Source 2 or dest’n

op rs rt

R 6 bits 5 bits

rd

5 bits

sh

6 bits

10 5 fn

jta Jump target address, 26 bits

imm Operand / Offset, 16 bits

Destination Unused Opcode ext I

J

inst Instruction, 32 bits

bltz,jr

beq,bne

12 A/L, lui, lw,sw

j,jal

syscall

22 instructions

Page 7: Skup instrukcija  mikropocesora MIPS

13.3 A Single-Cycle Data Path

Fig. 13.3 Key elements of the single-cycle MicroMIPS data path.

/

ALU

Data cache

Instr cache

Next addr

Reg file

op

jta

fn

inst

imm

rs (rs)

(rt)

Data addr

Data in 0

1

ALUSrc ALUFunc DataWrite

DataRead

SE

RegInSrc

rt

rd

RegDst RegWrite

32 / 16

Register input

Data out

Func

ALUOvfl

Ovfl

31

0 1 2

Next PC

Incr PC

(PC)

Br&Jump

ALU out

PC

0 1 2

Instruction fetch Reg access / decode ALU operation Data access

Register writeback

Page 8: Skup instrukcija  mikropocesora MIPS

An ALU for MicroMIPS

Fig. 10.19 A multifunction ALU with 8 control signals (2 for function class, 1 arithmetic, 3 shift, 2 logic) specifying the operation.

AddSub

x y

y

x

Adder

c 32

c 0

k /

Shifter

Logic unit

s

Logic function

Amount

5

2

Constant amount

Variable amount

5

5

ConstVar

0

1

0

1

2

3

Function class

2

Shift function

5 LSBs Shifted y

32

32

32

2

c 31

32-input NOR

Ovfl Zero

32

32

MSB

ALU

y

x

s

Shorthand symbol for ALU

Ovfl Zero

Func

Control

0 or 1

AND 00 OR 01

XOR 10 NOR 11

00 Shift 01 Set less 10 Arithmetic 11 Logic

00 No shift 01 Logical left 10 Logical right 11 Arith right

lui

imm

Page 9: Skup instrukcija  mikropocesora MIPS

13.4 Branching and Jumping

Fig. 13.4 Next-address logic for MicroMIPS (see top part of Fig. 13.3). Adder

jta imm

(rs)

(rt)

SE

SysCallAddr

PCSrc

(PC)

Branch condition checker

in c

1 0 1 2 3

/ 30

/ 32 BrTrue / 32

/ 30 / 30

/ 30

/ 30

/ 30

/ 30

/ 26

/ 30

/ 30 4 MSBs

30 MSBs

BrType

IncrPC

NextPC

/ 30 31:2

16

(PC)31:2 + 1 Default option

(PC)31:2 + 1 + imm When instruction is branch and condition is met

(PC)31:28 | jta When instruction is j or jal

(rs)31:2 When the instruction is jr SysCallAddr Start address of an operating system routine

Update options for PC

Lowest 2 bits of PC always 00

4 MSBs

Page 10: Skup instrukcija  mikropocesora MIPS

13.5 Deriving the Control SignalsTable 13.2 Control signals for the single-cycle MicroMIPS implementation.

Control signal 0 1 2 3

RegWrite Don’t write Write

RegDst1, RegDst0 rt rd $31

RegInSrc1, RegInSrc0 Data out ALU out IncrPC  

ALUSrc (rt ) imm    

AddSub Add Subtract  

LogicFn1, LogicFn0 AND OR XOR NOR

FnClass1, FnClass0 lui Set less Arithmetic Logic

DataRead Don’t read Read    

DataWrite Don’t write Write    

BrType1, BrType0 No branch beq bne bltz

PCSrc1, PCSrc0 IncrPC jta (rs) SysCallAddr

Reg file

Data cache

Next addr

ALU

Page 11: Skup instrukcija  mikropocesora MIPS

Control Signal

Settings

Table 13.3

Load upper immediate Add Subtract Set less than Add immediate Set less than immediate AND OR XOR NOR AND immediate OR immediate XOR immediate Load word Store word Jump Jump register Branch on less than 0 Branch on equal Branch on not equal Jump and link System call

001111 000000 100000 000000 100010 000000 101010 001000 001010 000000 100100 000000 100101 000000 100110 000000 100111 001100 001101 001110 100011 101011 000010 000000 001000 000001 000100 000101 000011 000000 001100

1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0

op fn

00 01 01 01 00 00 01 01 01 01 00 00 00 00

10

01 01 01 01 01 01 01 01 01 01 01 01 01 00

10

1 0 0 0 1 1 0 0 0 0 1 1 1 1 1

0 1 1 0 1 0 0

00 01 10 11 00 01 10

00 10 10 01 10 01 11 11 11 11 11 11 11 10 10

0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

11 0110 00

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 10 00 00 00 01 11

Instruction Reg

Writ

e

Reg

Dst

Reg

InS

rc

ALU

Src

Add

’Sub

Logi

cFn

FnC

lass

Dat

aR

ead

Dat

aWrit

e

BrT

ype

PC

Src

Page 12: Skup instrukcija  mikropocesora MIPS

Instruction Decoding

Fig. 13.5 Instruction decoder for MicroMIPS built of two 6-to-64 decoders.

jrInst

norInst

sltInst

orInst

xorInst

syscallInst

andInst

addInst

subInst

RtypeInst

bltzInst jInst jalInst beqInst bneInst

sltiInst

andiInst oriInst

xoriInst luiInst

lwInst

swInst

addiInst

1

0

1 2

3

4 5

10

12 13

14

15

35

43

63

8 o

p D

eco

de

r

fn D

eco

de

r

/ 6 / 6 op fn

0

8

12

32

34

36 37

38

39

42

63

Page 13: Skup instrukcija  mikropocesora MIPS

Control Signal Generation

Auxiliary signals identifying instruction classes

arithInst = addInst subInst sltInst addiInst sltiInst

logicInst = andInst orInst xorInst norInst andiInst oriInst xoriInst

immInst = luiInst addiInst sltiInst andiInst oriInst xoriInst

Example logic expressions for control signals

RegWrite = luiInst arithInst logicInst lwInst jalInst

ALUSrc = immInst lwInst swInst

AddSub = subInst sltInst sltiInst

DataRead = lwInst

PCSrc0 = jInst jalInst syscallInst

Page 14: Skup instrukcija  mikropocesora MIPS

Putting It All Together

/

ALU

Data cache

Instr cache

Next addr

Reg file

op

jta

fn

inst

imm

rs (rs)

(rt)

Data addr

Data in 0

1

ALUSrc ALUFunc DataWrite

DataRead

SE

RegInSrc

rt

rd

RegDst RegWrite

32 / 16

Register input

Data out

Func

ALUOvfl

Ovfl

31

0 1 2

Next PC

Incr PC

(PC)

Br&Jump

ALU out

PC

0 1 2

Fig. 13.3

Control

addInst

subInstjInst

sltInst

. ..

.

. .

Fig. 10.19

AddSub

x y

y

x

Adder

c 32

c 0

k /

Shifter

Logic unit

s

Logic function

Amount

5

2

Constant amount

Variable amount

5

5

ConstVar

0

1

0

1

2

3

Function class

2

Shift function

5 LSBs Shifted y

32

32

32

2

c 31

32-input NOR

Ovfl Zero

32

32

MSB

ALU

y

x

s

Shorthand symbol for ALU

Ovfl Zero

Func

Control

0 or 1

AND 00 OR 01

XOR 10 NOR 11

00 Shift 01 Set less 10 Arithmetic 11 Logic

00 No shift 01 Logical left 10 Logical right 11 Arith right

imm

lui

Adder

jta imm

(rs)

(rt)

SE

SysCallAddr

PCSrc

(PC)

Branch condition checker

in c

1 0 1 2 3

/ 30

/ 32 BrTrue / 32

/ 30 / 30

/ 30

/ 30

/ 30

/ 30

/ 26

/ 30

/ 30 4 MSBs

30 MSBs

BrType

IncrPC

NextPC

/ 30 31:2

16

Fig. 13.4

4 MSBs

Page 15: Skup instrukcija  mikropocesora MIPS

13.6 Performance of the Single-Cycle Design

Fig. 13.6 The MicroMIPS data path unfolded (by depicting the register write step as a separate block) so as to better visualize the critical-path latencies.

Instruction access 2 nsRegister read 1 nsALU operation 2 nsData cache access 2 nsRegister write 1 ns Total 8 ns

Single-cycle clock = 125 MHz

P C

P C

P C

P C

P C

ALU-type

Load

Store

Branch

Jump

Not used

Not used

Not used

Not used

Not used

Not used

Not used

Not used

Not used

(and jr)

(except jr & jal)

R-type 44% 6 nsLoad 24% 8 nsStore 12% 7 nsBranch 18% 5 nsJump 2% 3 ns

Weighted mean 6.36 ns

Page 16: Skup instrukcija  mikropocesora MIPS

How Good is Our Single-Cycle Design?

Instruction access 2 nsRegister read 1 nsALU operation 2 nsData cache access 2 nsRegister write 1 ns Total 8 ns

Single-cycle clock = 125 MHz

Clock rate of 125 MHz not impressive

How does this compare with current processors on the market?

Not bad, where latency is concerned

A 2.5 GHz processor with 20 or so pipeline stages has a latency of about

0.4 ns/cycle 20 cycles = 8 ns

Throughput, however, is much better for the pipelined processor:

Up to 20 times better with single issue

Perhaps up to 100 times better with multiple issue

Page 17: Skup instrukcija  mikropocesora MIPS

A Multicycle Implementation

Fig. Single-cycle versus multicycle instruction execution.

Clock

Clock

Instr 2 Instr 1 Instr 3 Instr 4

3 cycles 3 cycles 4 cycles 5 cycles

Time saved

Instr 1 Instr 4 Instr 3 Instr 2

Time needed

Time needed

Time allotted

Time allotted

Page 18: Skup instrukcija  mikropocesora MIPS

A Multicycle Data Path

Fig. Abstract view of a multicycle instruction execution unit for MicroMIPS. For naming of instruction fields

ALU

Cache

Control

Reg file

op

jta

fn

imm

rs,rt,rd (rs)

(rt)

Address

Data

Inst Reg

Data Reg

x Reg

y Reg

z Reg PC

Page 19: Skup instrukcija  mikropocesora MIPS

Multicycle Data Path with Control Signals Shown

Fig. Key elements of the multicycle MicroMIPS data path.

Three major changes relative to the single-cycle data path:

1. Instruction & data caches combined

2. ALU performs double duty for address calculation

3. Registers added for intercycle data

/

16

rs

0 1

0 1 2

ALU

Cache Reg file

op

jta

fn

(rs)

(rt)

Address

Data

Inst Reg

Data Reg

x Reg

y Reg

z Reg PC

4

ALUSrcX

ALUFunc

MemWrite MemRead

RegInSrc

4

rd

RegDst

RegWrite

/

32

Func

ALUOvfl

Ovfl

31

PCSrc PCWrite

IRWrite

ALU out

0 1

0 1

0 1 2 3

0 1 2 3

InstData ALUSrcY

SysCallAddr

/

26

4

rt

ALUZero

Zero

x Mux

y Mux

0 1

JumpAddr

4 MSBs

/

30

30

SE

imm

2

Corrections are shown in red

Page 20: Skup instrukcija  mikropocesora MIPS

Clock Cycle and Control Signals Table Control signal 0 1 2 3

JumpAddr jta SysCallAddr

PCSrc1, PCSrc0 Jump addr x reg z reg ALU out

PCWrite Don’t write Write    

InstData PC z reg    

MemRead Don’t read Read    

MemWrite Don’t write Write    

IRWrite Don’t write Write    

RegWrite Don’t write Write    

RegDst1, RegDst0 rt rd $31  

RegInSrc1, RegInSrc0 Data reg z reg PC  

ALUSrcX PC x reg    

ALUSrcY1, ALUSrcY0 4 y reg imm 4 imm

AddSub Add Subtract    

LogicFn1, LogicFn0 AND OR XOR NOR

FnClass1, FnClass0 lui Set less Arithmetic Logic

Register file

ALU

Cache

Program counter

Page 21: Skup instrukcija  mikropocesora MIPS

Execution Cycles

Table Execution cycles for multicycle MicroMIPS

Instruction Operations Signal settingsAny Read out the instruction and

write it into instruction register, increment PC

InstData = 0, MemRead = 1IRWrite = 1, ALUSrcX = 0ALUSrcY = 0, ALUFunc = ‘+’PCSrc = 3, PCWrite = 1

Any Read out rs & rt into x & y registers, compute branch address and save in z register

ALUSrcX = 0, ALUSrcY = 3ALUFunc = ‘+’

ALU type Perform ALU operation and save the result in z register

ALUSrcX = 1, ALUSrcY = 1 or 2ALUFunc: Varies

Load/Store Add base and offset values, save in z register

ALUSrcX = 1, ALUSrcY = 2ALUFunc = ‘+’

Branch If (x reg) = < (y reg), set PC to branch target address

ALUSrcX = 1, ALUSrcY = 1ALUFunc= ‘’, PCSrc = 2PCWrite = ALUZero or ALUZero or ALUOut31

Jump Set PC to the target address jta, SysCallAddr, or (rs)

JumpAddr = 0 or 1,PCSrc = 0 or 1, PCWrite = 1

ALU type Write back z reg into rd RegDst = 1, RegInSrc = 1RegWrite = 1

Load Read memory into data reg InstData = 1, MemRead = 1

Store Copy y reg into memory InstData = 1, MemWrite = 1

Load Copy data register into rt RegDst = 0, RegInSrc = 0RegWrite = 1

Fetch & PC incr

Decode & reg read

ALU oper & PC update

Reg write or mem access

Reg write for lw

1

2

3

4

5

Page 22: Skup instrukcija  mikropocesora MIPS

The Control State Machine

Fig. The control state machine for multicycle MicroMIPS.

State 0 InstData = 0

MemRead = 1 IRWrite = 1

ALUSrcX = 0 ALUSrcY = 0 ALUFunc = ‘+’

PCSrc = 3 PCWrite = 1

Start

Cycle 1 Cycle 3 Cycle 2 Cycle 1 Cycle 4 Cycle 5

ALU- type

lw/ sw lw

sw

State 1

ALUSrcX = 0 ALUSrcY = 3 ALUFunc = ‘+’

State 5 ALUSrcX = 1 ALUSrcY = 1 ALUFunc = ‘ ’ JumpAddr = %

PCSrc = @ PCWrite = #

State 8

RegDst = 0 or 1 RegInSrc = 1 RegWrite = 1

State 7

ALUSrcX = 1 ALUSrcY = 1 or 2 ALUFunc = Varies

State 6

InstData = 1 MemWrite = 1

State 4

RegDst = 0 RegInSrc = 0 RegWrite = 1

State 2

ALUSrcX = 1 ALUSrcY = 2 ALUFunc = ‘+’

State 3

InstData = 1 MemRead = 1

Jump/ Branch

Notes for State 5: % 0 for j or jal, 1 for syscall, don’t-care for other instr’s @ 0 for j, jal, and syscall, 1 for jr, 2 for branches # 1 for j, jr, jal, and syscall, ALUZero () for beq (bne), bit 31 of ALUout for bltz For jal, RegDst = 2, RegInSrc = 1, RegWrite = 1

Note for State 7: ALUFunc is determined based on the op and fn f ields

Speculative calculation of branch address

Branches based on instruction

Page 23: Skup instrukcija  mikropocesora MIPS

State and Instruction Decoding

Fig. State and instruction decoders for multicycle MicroMIPS.

jrInst

norInst

sltInst

orInst xorInst

syscallInst

andInst

addInst

subInst

RtypeInst

bltzInst jInst

jalInst beqInst bneInst

sltiInst

andiInst oriInst xoriInst luiInst

lwInst

swInst

andiInst

1

0

1

2

3 4

5

10

12 13

14

15

35

43

63

8

op

De

cod

er

fn D

eco

de

r

/ 6 / 6 op fn

0

8

12

32

34

36 37

38 39

42

63

ControlSt0 ControlSt1 ControlSt2 ControlSt3 ControlSt4 ControlSt5

ControlSt8

ControlSt6 1

st D

eco

de

r

/ 4

st

0 1 2 3 4 5

7

12 13 14 15

8 9 10

6

11

ControlSt7

addiInst

Page 24: Skup instrukcija  mikropocesora MIPS

Control Signal Generation

Certain control signals depend only on the control state

ALUSrcX = ControlSt2 ControlSt5 ControlSt7RegWrite = ControlSt4 ControlSt8

Auxiliary signals identifying instruction classes

addsubInst = addInst subInst addiInstlogicInst = andInst orInst xorInst norInst andiInst oriInst xoriInst

Logic expressions for ALU control signals

AddSub = ControlSt5 (ControlSt7 subInst)FnClass1 = ControlSt7 addsubInst logicInst

FnClass0 = ControlSt7 (logicInst sltInst sltiInst)

LogicFn1 = ControlSt7 (xorInst xoriInst norInst)

LogicFn0 = ControlSt7 (orInst oriInst norInst)

Page 25: Skup instrukcija  mikropocesora MIPS

Performance of the Multicycle Design

Fig. The MicroMIPS data path unfolded (by depicting the register write step as a separate block) so as to better visualize the critical-path latencies.

P C

P C

P C

P C

P C

ALU-type

Load

Store

Branch

Jump

Not used

Not used

Not used

Not used

Not used

Not used

Not used

Not used

Not used

(and jr)

(except jr & jal)

R-type 44% 4 cyclesLoad 24% 5 cyclesStore 12% 4 cyclesBranch 18% 3 cyclesJump 2% 3 cycles

Contribution to CPIR-type 0.444 = 1.76Load 0.245 = 1.20Store 0.124 = 0.48Branch 0.183 = 0.54Jump 0.023 = 0.06

_____________________________

Average CPI 4.04

Page 26: Skup instrukcija  mikropocesora MIPS

How Good is Our Multicycle Design?

Clock rate of 500 MHz better than 125 MHz of single-cycle design, but still unimpressive

How does the performance compare with current processors on the market?

Not bad, where latency is concerned

A 2.5 GHz processor with 20 or so pipeline stages has a latency of about 0.4 20 = 8 ns

Throughput, however, is much better for the pipelined processor:

Up to 20 times better with single issue

Perhaps up to 100 with multiple issue

R-type 44% 4 cyclesLoad 24% 5 cyclesStore 12% 4 cyclesBranch 18% 3 cyclesJump 2% 3 cycles

Contribution to CPIR-type 0.444 = 1.76

Load 0.245 = 1.20Store 0.124 = 0.48Branch 0.183 = 0.54Jump 0.023 = 0.06

_____________________________

Average CPI 4.04

Cycle time = 2 nsClock rate = 500 MHz

Page 27: Skup instrukcija  mikropocesora MIPS

Microprogramming

State 0 InstData = 0

MemRead = 1 IRWrite = 1

ALUSrcX = 0 ALUSrcY = 0 ALUFunc = ‘+’

PCSrc = 3 PCWrite = 1

Start

Cycle 1 Cycle 3 Cycle 2 Cycle 1 Cycle 4 Cycle 5

ALU- type

lw/ sw lw

sw

State 1

ALUSrcX = 0 ALUSrcY = 3 ALUFunc = ‘+’

State 5 ALUSrcX = 1 ALUSrcY = 1 ALUFunc = ‘ ’ JumpAddr = %

PCSrc = @ PCWrite = #

State 8

RegDst = 0 or 1 RegInSrc = 1 RegWrite = 1

State 7

ALUSrcX = 1 ALUSrcY = 1 or 2 ALUFunc = Varies

State 6

InstData = 1 MemWrite = 1

State 4

RegDst = 0 RegInSrc = 0 RegWrite = 1

State 2

ALUSrcX = 1 ALUSrcY = 2 ALUFunc = ‘+’

State 3

InstData = 1 MemRead = 1

Jump/ Branch

Notes for State 5: % 0 for j or jal, 1 for syscall, don’t-care for other instr’s @ 0 for j, jal, and syscall, 1 for jr, 2 for branches # 1 for j, jr, jal, and syscall, ALUZero () for beq (bne), bit 31 of ALUout for bltz For jal, RegDst = 2, RegInSrc = 1, RegWrite = 1

Note for State 7: ALUFunc is determined based on the op and fn f ields

The control state machine resembles a program (microprogram)

Microinstruction

Fig. Possible 22-bit microinstruction format for MicroMIPS.

PC control

Cache control

Register control

ALU inputs

JumpAddr PCSrc

PCWrite

InstData MemRead

MemWrite IRWrite

FnType LogicFn

AddSub ALUSrcY

ALUSrcX RegInSrc

RegDst RegWrite

Sequence control

ALU function

2bits

23

Page 28: Skup instrukcija  mikropocesora MIPS

Symbolic Names for Microinstruction Field ValuesTable Microinstruction field values and their symbolic names. The default value for each unspecified field is the all 0s bit pattern.

Field name Possible field values and their symbolic names

PC control0001 1001 x011 x101 x111

PCjump PCsyscall PCjreg PCbranch PCnext

Cache control0101 1010 1100

CacheFetch CacheStore CacheLoad

Register control1000 1001 1011 1101

rt Data rt z rd z $31 PC

ALU inputs*000 011 101 110

PC 4 PC 4imm x y x imm

ALU function*

0xx10 1xx01 1xx10 x0011 x0111

+ <

x1011 x1111 xxx00

lui

Seq. control01 10 11

PCdisp1 PCdisp2 PCfetch* The operator symbol stands for any of the ALU functions defined above (except for “lui”).

10000 10001 10101 11010

x10

(imm)

Page 29: Skup instrukcija  mikropocesora MIPS

Control Unit for Microprogramming

Fig. Microprogrammed control unit for MicroMIPS .

Microprogram memory or PLA

op (from instruction register) Control signals to data path

Address 1

Incr

MicroPC

Data

0

Sequence control

0

1

2

3

Dispatch table 1

Dispatch table 2

Microinstruction register

fetch: ---------------

andi: ----------

Multiway branch

64 entries in each table

Page 30: Skup instrukcija  mikropocesora MIPS

Microprogram for MicroMIPS

Fig. The complete MicroMIPS microprogram.

fetch: PCnext, CacheFetch, PC+4 # State 0 (start)PC + 4imm, PCdisp1 # State 1

lui1: lui(imm) # State 7luirt z, PCfetch # State 8lui

add1: x + y # State 7addrd z, PCfetch # State 8add

sub1: x - y # State 7subrd z, PCfetch # State 8sub

slt1: x - y # State 7sltrd z, PCfetch # State 8slt

addi1: x + imm # State 7addirt z, PCfetch # State 8addi

slti1: x - imm # State 7sltirt z, PCfetch # State 8slti

and1: x y # State 7andrd z, PCfetch # State 8and

or1: x y # State 7orrd z, PCfetch # State 8or

xor1: x y # State 7xorrd z, PCfetch # State 8xor

nor1: x y # State 7norrd z, PCfetch # State 8nor

andi1: x imm # State 7andirt z, PCfetch # State 8andi

ori1: x imm # State 7orirt z, PCfetch # State 8ori

xori: x imm # State 7xorirt z, PCfetch # State 8xori

lwsw1: x + imm, mPCdisp2 # State 2lw2: CacheLoad # State 3

rt Data, PCfetch # State 4sw2: CacheStore, PCfetch # State 6j1: PCjump, PCfetch # State 5jjr1: PCjreg, PCfetch # State 5jrbranch1: PCbranch, PCfetch # State 5branchjal1: PCjump, $31PC, PCfetch # State 5jalsyscall1:PCsyscall, PCfetch # State 5syscall

Page 31: Skup instrukcija  mikropocesora MIPS

Exception Handling

Exceptions and interrupts alter the normal program flow

Examples of exceptions (things that can go wrong):

ALU operation leads to overflow (incorrect result is obtained) Opcode field holds a pattern not representing a legal operation Cache error-code checker deems an accessed word invalid Sensor signals a hazardous condition (e.g., overheating)

Exception handler is an OS program that takes care of the problem

Derives correct result of overflowing computation, if possible Invalid operation may be a software-implemented instruction

Interrupts are similar, but usually have external causes (e.g., I/O)

Page 32: Skup instrukcija  mikropocesora MIPS

Exception Control States

Fig. Exception states 9 and 10 added to the control state machine.

State 0 InstData = 0 MemRead = 1

IRWrite = 1 ALUSrcX = 0 ALUSrcY = 0 ALUFunc = ‘+’

PCSrc = 3 PCWrite = 1

Start

Cycle 1 Cycle 3 Cycle 2 Cycle 4 Cycle 5

ALU- type

lw/ sw lw

sw

State 1

ALUSrcX = 0 ALUSrcY = 3 ALUFunc = ‘+’

State 5 ALUSrcX = 1 ALUSrcY = 1 ALUFunc = ‘ ’ JumpAddr = %

PCSrc = @ PCWrite = #

State 8

RegDst = 0 or 1 RegInSrc = 1 RegWrite = 1

State 7

ALUSrcX = 1 ALUSrcY = 1 or 2 ALUFunc = Varies

State 6

InstData = 1 MemWrite = 1

State 4

RegDst = 0 RegInSrc = 0 RegWrite = 1

State 2

ALUSrcX = 1 ALUSrcY = 2 ALUFunc = ‘+’

State 3

InstData = 1 MemRead = 1

Jump/ Branch

State 10 IntCause = 0

CauseWrite = 1 ALUSrcX = 0 ALUSrcY = 0 ALUFunc = ‘ ’ EPCWrite = 1 JumpAddr = 1

PCSrc = 0 PCWrite = 1

State 9 IntCause = 1

CauseWrite = 1 ALUSrcX = 0 ALUSrcY = 0 ALUFunc = ‘ ’ EPCWrite = 1 JumpAddr = 1

PCSrc = 0 PCWrite = 1

Illegal operation

Overflow

Page 33: Skup instrukcija  mikropocesora MIPS

Single-Cycle Data Path

Fig. Key elements of the single-cycle MicroMIPS data path.

/

ALU

Data cache

Instr cache

Next addr

Reg file

op

jta

fn

inst

imm

rs (rs)

(rt)

Data addr

Data in 0

1

ALUSrc ALUFunc DataWrite

DataRead

SE

RegInSrc

rt

rd

RegDst RegWrite

32 / 16

Register input

Data out

Func

ALUOvfl

Ovfl

31

0 1 2

Next PC

Incr PC

(PC)

Br&Jump

ALU out

PC

0 1 2

Clock rate = 125 MHzCPI = 1 (125 MIPS)

Page 34: Skup instrukcija  mikropocesora MIPS

Multicycle Data Path

Fig. Key elements of the multicycle MicroMIPS data path.

Clock rate = 500 MHz

CPI 4 ( 125 MIPS)

/

16

rs

0 1

0 1 2

ALU

Cache Reg file

op

jta

fn

(rs)

(rt)

Address

Data

Inst Reg

Data Reg

x Reg

y Reg

z Reg PC

4

ALUSrcX

ALUFunc

MemWrite MemRead

RegInSrc

4

rd

RegDst

RegWrite

/

32

Func

ALUOvfl

Ovfl

31

PCSrc PCWrite

IRWrite

ALU out

0 1

0 1

0 1 2 3

0 1 2 3

InstData ALUSrcY

SysCallAddr

/

26

4

rt

ALUZero

Zero

x Mux

y Mux

0 1

JumpAddr

4 MSBs

/

30

30

SE

imm

2

Page 35: Skup instrukcija  mikropocesora MIPS

Getting the Best of Both Worlds

Single-cycle:Clock rate = 125 MHz

CPI = 1

Multicycle:Clock rate = 500 MHz

CPI 4

Pipelined:Clock rate = 500 MHz

CPI 1

Page 36: Skup instrukcija  mikropocesora MIPS

Pipelining Concepts

Fig. Pipelining in the student registration process.

Approval Cashier Registrar ID photo Pickup

Start here

Exit

1 2 3 4 5

Strategies for improving performance

1 – Use multiple independent data paths accepting several instructions that are read out at once: multiple-instruction-issue or superscalar 

2 – Overlap execution of several instructions, starting the next instruction before the previous one has run to completion: (super)pipelined

Page 37: Skup instrukcija  mikropocesora MIPS

Pipelined Instruction Execution

Fig. Pipelining in the MicroMIPS instruction execution process.

Cycle 7 Cycle 6 Cycle 5 Cycle 4 Cycle 3 Cycle 2 Cycle 1 Cycle 8

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Cycle 9

Instr cache

Instr cache

Instr cache

Instr cache

Instr cache

Data cache

Data cache

Data cache

Data cache

Data cache

Time dimension

Task dimension

In

str

1

In

str

2

In

str

3

In

str

4

In

str

5

Page 38: Skup instrukcija  mikropocesora MIPS

Alternate Representations of a Pipeline

Fig. Two abstract graphical representations of a 5-stage pipeline executing 7 tasks (instructions).

1

2

3

4

5

1

2

3

4

5

6

7

(a) Task-time diagram (b) Space-time diagram

Cycle

Instruction

Cycle

Pipeline stage

1 2 3 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11

Start-up region

Drainage region

a

a

a

a

a

a

a

w

w

w

w

w

w

w

f

f

f

f

f

f

f

r

r

r

r

r

r

r

d

d

d

d

d

d

d

a a a a a a a

w w w w w w w

d d d d d d d

r r r r r r r

f f f f f f f

f = Fetch r = Reg read a = ALU op d = Data access w = Writeback

Except for start-up and drainage overheads, a pipeline can execute one instruction per clock tick; IPS is dictated by the clock frequency

Page 39: Skup instrukcija  mikropocesora MIPS

 

Pipelining Example in a Photocopier

Example

A photocopier with an x-sheet document feeder copies the first sheet in 4 s and each subsequent sheet in 1 s. The copier’s paper path is a 4-stage pipeline with each stage having a latency of 1s. The first sheet goes through all 4 pipeline stages and emerges after 4 s. Each subsequent sheet emerges 1s after the previous sheet. How does the throughput of this photocopier vary with x, assuming that loading the document feeder and removing the copies takes 15 s.

Solution

Each batch of x sheets is copied in 15 + 4 + (x – 1) = 18 + x seconds. A nonpipelined copier would require 4x seconds to copy x sheets. For x > 6, the pipelined version has a performance edge. When x = 50, the pipelining speedup is (4 50) / (18 + 50) = 2.94.

Page 40: Skup instrukcija  mikropocesora MIPS

Pipeline Stalls or Bubbles

Fig. Read-after-write data dependency and its possible resolution through data forwarding .

Cycle 7 Cycle 6 Cycle 5 Cycle 4 Cycle 3 Cycle 2 Cycle 1 Cycle 8

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

$5 = $6 + $7

$8 = $8 + $6

$9 = $8 + $2

sw $9, 0($3)

Data forwarding

Instr cache

Instr cache

Instr cache

Instr cache

Data cache

Data cache

Data cache

Data cache

First type of data dependency

Page 41: Skup instrukcija  mikropocesora MIPS

Inserting Bubbles in a Pipeline

Without data forwarding, three bubbles are needed to resolve a read-after-write data dependency

Cycle 7 Cycle 6 Cycle 5 Cycle 4 Cycle 3 Cycle 2 Cycle 1 Cycle 8

Reg file

Reg file ALU

Reg file

Reg file ALU

Reg file

Reg file ALU

Reg file

Reg file ALU

Reg file

Reg file ALU

Cycle 9

Instr cache

Instr cache

Instr cache

Instr cache

Instr cache

Data cache

Data cache

Data cache

Data cache

Data cache

Time dimension

Task dimension

In

str

1

In

str

2

In

str

3

In

str

4

In

str

5

Bubble

Bubble

Bubble

Writes into $8

Reads from $8

Cycle 7 Cycle 6 Cycle 5 Cycle 4 Cycle 3 Cycle 2 Cycle 1 Cycle 8

Reg file

Reg file ALU

Reg file

Reg file ALU

Reg file

Reg file ALU

Reg file

Reg file ALU

Reg file

Reg file ALU

Cycle 9

Instr cache

Instr cache

Instr cache

Instr cache

Instr cache

Data cache

Data cache

Data cache

Data cache

Data cache

Time dimension

Task dimension

In

str

1

In

str

2

In

str

3

In

str

4

In

str

5

Bubble

Bubble

Writes into $8

Reads from $8

Two bubbles, if we assume that a register can be updated and read from in one cycle

Page 42: Skup instrukcija  mikropocesora MIPS

Second Type of Data Dependency

Fig. Read-after-load data dependency and its possible resolution

through bubble insertion and data forwarding.

Cycle 7 Cycle 6 Cycle 5 Cycle 4 Cycle 3 Cycle 2 Cycle 1 Cycle 8

Data mem

Instr mem

Reg f ile

Reg file ALU

Data mem

Instr mem

Reg f ile

Reg file ALU

Data mem

Instr mem

Reg f ile

Reg file ALU

sw $6, . . .

lw $8, . . .

Insert bubble?

$9 = $8 + $2

Data mem

Instr mem

Reg f ile

Reg file ALU

Reorder?

Without data forwarding, three (two) bubbles are needed to resolve a read-after-load data dependency

Page 43: Skup instrukcija  mikropocesora MIPS

Control Dependency in a Pipeline

Fig. Control dependency due to conditional branch.

Cycle 7 Cycle 6 Cycle 5 Cycle 4 Cycle 3 Cycle 2 Cycle 1 Cycle 8

Data mem

Instr mem

Reg f ile

Reg file ALU

Data mem

Instr mem

Reg f ile

Reg file ALU

Data mem

Instr mem

Reg f ile

Reg file ALU

$6 = $3 + $5

beq $1, $2, . . .

Insert bubble?

$9 = $8 + $2

Data mem

Instr mem

Reg f ile

Reg file ALU

Reorder? (delayed branch)

Assume branch resolved here

Here would need 1-2 more bubbles

Page 44: Skup instrukcija  mikropocesora MIPS

Pipeline Timing and Performance

Fig. Pipelined form of a function unit with latching overhead.

Stage 1

Stage 2

Stage 3

Stage q 1

Stage q

t/q

Function unit

t

. . .

Latching of results

Page 45: Skup instrukcija  mikropocesora MIPS

Fig. Throughput improvement due to pipelining as a function of the number of pipeline stages for different pipelining overheads.

 

Throughput Increase in a q-Stage Pipeline

1 2 3 4 5 6 7 8 Number q of pipeline stages

Th

rou

gh

put i

mp

rove

men

t fa

cto

r

1

2

3

4

5

6

7

8

Ideal: /t = 0

/t = 0.1

/t = 0.05

tt / q +

or

q1 + q / t

Page 46: Skup instrukcija  mikropocesora MIPS

Assume that one bubble must be inserted due to read-after-load dependency and after a branch when its delay slot cannot be filled.Let be the fraction of all instructions that are followed by a bubble.

 

Pipeline Throughput with Dependencies

q(1 + q / t)(1 + )Pipeline speedup =

R-type 44% Load 24% Store 12% Branch 18% Jump 2%

Example

Calculate the effective CPI for MicroMIPS, assuming that a quarter of branch and load instructions are followed by bubbles.

Solution

Fraction of bubbles = 0.25(0.24 + 0.18) = 0.105CPI = 1 + = 1.105 (which is very close to the ideal value of 1)

Page 47: Skup instrukcija  mikropocesora MIPS

Pipelined Data Path Design

Fig. Key elements of the pipelined MicroMIPS data path. ALU

Data cache

Instr cache

Next addr

Reg file

op fn

inst

imm

rs (rs)

(rt)

Data addr

ALUSrc ALUFunc DataWrite DataRead

RegInSrc

rt

rd

RegDst RegWrite

Func

ALUOvfl

Ovfl

IncrPC

Br&Jump

PC

1 Incr

0

1

rt

31

0 1 2

NextPC

0

1

SeqInst

0 1 2

0 1

RetAddr

Stage 1 Stage 2 Stage 3 Stage 4 Stage 5

SE

Address

Data

Page 48: Skup instrukcija  mikropocesora MIPS

Pipelined Control

Fig. Pipelined control signals. ALU

Data cache

Instr cache

Next addr

Reg file

op fn

inst

imm

rs (rs)

(rt)

Data addr

ALUSrc ALUFunc

DataWrite DataRead

RegInSrc

rt

rd

RegDst RegWrite

Func

ALUOvfl

Ovfl

IncrPC

Br&Jump

PC

1 Incr

0

1

rt

31

0 1 2

NextPC

0

1

SeqInst

0 1 2

0 1

RetAddr

Stage 1 Stage 2 Stage 3 Stage 4 Stage 5

SE

5 3

2

Address

Data

Page 49: Skup instrukcija  mikropocesora MIPS

Optimal Pipelining

Fig. Higher-throughput pipelined data path for MicroMIPS and the execution of consecutive instructions in it .

Reg file

Data cache

Instr cache

Data cache

Instr cache

Data cache

Instr cache

Reg file

Reg f ile ALU

Reg file

Reg f ile ALU

Reg file

Reg f ile ALU

Instruction fetch

Register readout

ALU operation

Data read/store

Register writeback

PC

MicroMIPS pipeline with more than four-fold improvement

Page 50: Skup instrukcija  mikropocesora MIPS

 

Optimal Number of Pipeline Stages

Fig. Pipelined form of a function unit with latching overhead.

Stage 1

Stage 2

Stage 3

Stage q 1

Stage q

t/q

Function unit

t

. . .

Latching of results Assumptions:

Pipeline sliced into q stagesStage overhead is q/2 bubbles per branch (decision made midway)Fraction b of all instructions are taken branches

Derivation of q opt

Average CPI = 1 + b q / 2

Throughput = Clock rate / CPI = [(t / q + )(1 + b q / 2)] –1/2

Differentiate throughput expression with respect to q and equate with 0

q opt = [2(t / ) / b)]

1/2

Page 51: Skup instrukcija  mikropocesora MIPS

Data Dependencies and Hazards

Fig. Data dependency in a pipeline.

Cycle 7 Cycle 6 Cycle 5 Cycle 4 Cycle 3 Cycle 2 Cycle 1 Cycle 8

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Cycle 9

$2 = $1 - $3

Instructions that read register $2

Instr cache

Instr cache

Instr cache

Instr cache

Instr cache

Data cache

Data cache

Data cache

Data cache

Data cache

Page 52: Skup instrukcija  mikropocesora MIPS

Fig. When a previous instruction writes back a value computed by the ALU into a register, the data dependency can always be resolved through forwarding.

 

Resolving Data Dependencies via Forwarding

Cycle 7 Cycle 6 Cycle 5 Cycle 4 Cycle 3 Cycle 2 Cycle 1 Cycle 8

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Cycle 9

$2 = $1 - $3

Instructions that read register $2

Instr cache

Instr cache

Instr cache

Instr cache

Data cache

Data cache

Data cache

Data cache

Page 53: Skup instrukcija  mikropocesora MIPS

Fig. When the immediately preceding instruction writes a value read out from the data memory into a register, the data dependency cannot be resolved through forwarding (i.e., we cannot go back in time) and a bubble must be inserted in the pipeline.

 

Certain Data Dependencies Lead to Bubbles Cycle 7 Cycle 6 Cycle 5 Cycle 4 Cycle 3 Cycle 2 Cycle 1 Cycle 8

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Reg f ile

Reg file ALU

Cycle 9

lw $2,4($12)

Instructions that read register $2

Instr cache

Instr cache

Instr cache

Instr cache

Data cache

Data cache

Data cache

Data cache

Page 54: Skup instrukcija  mikropocesora MIPS

Data Forwarding

Fig. Forwarding unit for the pipelined MicroMIPS data path.

(rt)

0 1 2

SE

ALU

Data cache

RegInSrc4

Func

Ovfl

0 1

0 1

RegWrite4 RetAddr3

(rs)

ALUSrc1

Stage 3 Stage 4 Stage 5 Stage 2

x3

y3

x4

y4 x3 y3

x4 y4

RegWrite3

d3 d4

x3 y3

x4 y4

Reg file

rs rt

RegInSrc3

ALUSrc2

s2

t2

d4 d3

RetAddr3, RegWrite3, RegWrite4 RegInSrc3, RegInSrc4

RetAddr3, RegWrite3, RegWrite4 RegInSrc3, RegInSrc4

d4 d3

Forwarding unit, upper

Forwarding unit, lower

x2

y2

Page 55: Skup instrukcija  mikropocesora MIPS

Design of the Data Forwarding Units

Fig. Forwarding unit for the pipelined MicroMIPS data path.

(rt)

0 1 2

SE

ALU

Data cache

RegInSrc4

Func

Ovfl

0 1

0 1

RegWrite4 RetAddr3

(rs)

ALUSrc1

Stage 3 Stage 4 Stage 5 Stage 2

x3

y3

x4

y4 x3 y3

x4 y4

RegWrite3

d3 d4

x3 y3

x4 y4

Reg file

rs rt

RegInSrc3

ALUSrc2

s2

t2

d4 d3

RetAddr3, RegWrite3, RegWrite4 RegInSrc3, RegInSrc4

RetAddr3, RegWrite3, RegWrite4 RegInSrc3, RegInSrc4

d4 d3

Forwarding unit, upper

Forwarding unit, lower

x2

y2

RegWrite3 RegWrite4 s2matchesd3 s2matchesd4 RetAddr3 RegInSrc3 RegInSrc4 Choose

0 0 x x x x x x2

0 1 x 0 x x x x2

0 1 x 1 x x 0 x4

0 1 x 1 x x 1 y4

1 0 1 x 0 1 x x3

1 0 1 x 1 1 x y3

1 1 1 1 0 1 x x3

Table Partial truth table for the upper forwarding unit in the pipelined MicroMIPS data path.

Let’s focus on designing the upper data forwarding unit

Incorrect in textbook

Page 56: Skup instrukcija  mikropocesora MIPS

 

Hardware for Inserting Bubbles

Fig. Data hazard detector for the pipelined MicroMIPS data path.

(rt)

0 1 2

(rs)

Stage 3 Stage 2

Reg file

rs rt

t2

Data hazard detector

x2

y2

Control signals from decoder

DataRead2

Instr cache

LoadPC

Stage 1

PC Inst reg

All-0s

0 1

Bubble

Controls or all-0s

Page 57: Skup instrukcija  mikropocesora MIPS

Pipeline Branch Hazards

Software-based solutions

Compiler inserts a “no-op” after every branch (simple, but wasteful)

Branch is redefined to take effect after the instruction that follows it

Branch delay slot(s) are filled with useful instructions via reordering

Hardware-based solutions

Mechanism similar to data hazard detector to flush the pipeline

Constitutes a rudimentary form of branch prediction:Always predict that the branch is not taken, flush if mistaken

More elaborate branch prediction strategies possible

Page 58: Skup instrukcija  mikropocesora MIPS

Branch Prediction

Predicting whether a branch will be taken

Always predict that the branch will not be taken

Use program context to decide (backward branch is likely taken, forward branch is likely not taken)

Allow programmer or compiler to supply clues

Decide based on past history (maintain a small history table); to be discussed later

Apply a combination of factors: modern processors use elaborate techniques due to deep pipelines

Page 59: Skup instrukcija  mikropocesora MIPS

 

A Simple Branch Prediction Algorithm

Fig. Four-state branch prediction scheme.

Not taken

Predict taken

Predict taken again

Predict not taken

Predict not taken

again

Not taken Taken

Not taken Taken

Taken Not taken

Taken

Example

L1: ---- ----L2: ---- ---- br <c2> L2 ---- br <c1> L1

20 iter’s

10 iter’sImpact of different branch prediction schemes

Solution

Always taken: 11 mispredictions, 94.8% accurate1-bit history: 20 mispredictions, 90.5% accurate2-bit history: Same as always taken

Page 60: Skup instrukcija  mikropocesora MIPS

 

Hardware Implementation of Branch Prediction

Fig. Hardware elements for a branch prediction scheme.

The mapping scheme used to go from PC contents to a table entry is the same as that used in direct-mapped caches (Chapter 18)

Compare

Addresses of recent branch instructions

Target addresses

History bit(s) Low-order

bits used as index

Logic From PC

Incremented PC

Next PC

0

1

=

Read-out table entry

Page 61: Skup instrukcija  mikropocesora MIPS

Advanced Pipelining

Fig. Dynamic instruction pipeline with in-order issue, possible out-of-order completion, and in-order retirement.

Deep pipeline = superpipeline; also, superpipelined, superpipeliningParallel instruction issue = superscalar, j-way issue (2-4 is typical)

Stage 1

Instr cache

Instruction fetch

Function unit 1

Function unit 2

Function unit 3

Stage 2 Stage 3 Stage 4 Variable # of stages Stage q 2 Stage q 1 Stage q

Ope- rand prep

Instr decode

Retirement & commit stages

Instr issue

Stage 5

Page 62: Skup instrukcija  mikropocesora MIPS

Performance Improvement for Deep Pipelines

Hardware-based methods

Lookahead past an instruction that will/may stall in the pipeline(out-of-order execution; requires in-order retirement)

Issue multiple instructions (requires more ports on register file)Eliminate false data dependencies via register renamingPredict branch outcomes more accurately, or speculate

Software-based method

Pipeline-aware compilationLoop unrolling to reduce the number of branches

Loop: Compute with index i Loop: Compute with index i

Increment i by 1 Compute with index i + 1

Go to Loop if not done Increment i by 2

Go to Loop if not done

Page 63: Skup instrukcija  mikropocesora MIPS

 

CPI Variations with Architectural Features

Table Effect of processor architecture, branch prediction methods, and speculative execution on CPI.

Architecture Methods used in practice CPI

Nonpipelined, multicycle Strict in-order instruction issue and exec 5-10

Nonpipelined, overlapped In-order issue, with multiple function units 3-5

Pipelined, static In-order exec, simple branch prediction 2-3

Superpipelined, dynamic Out-of-order exec, adv branch prediction 1-2

Superscalar 2- to 4-way issue, interlock & speculation 0.5-1

Advanced superscalar 4- to 8-way issue, aggressive speculation 0.2-0.5

Page 64: Skup instrukcija  mikropocesora MIPS

Development of Intel’s Desktop/Laptop Micros

In the beginning, there was the 8080; led to the 80x86 = IA32 ISA

Half a dozen or so pipeline stages

802868038680486Pentium (80586)

A dozen or so pipeline stages, with out-of-order instruction execution

Pentium ProPentium IIPentium IIICeleron

Two dozens or so pipeline stages

Pentium 4

More advanced technology

More advanced technology

Instructions are broken into micro-ops which are executed out-of-order but retired in-order

Page 65: Skup instrukcija  mikropocesora MIPS

Dealing with Exceptions

Exceptions present the same problems as branches

How to handle instructions that are ahead in the pipeline?(let them run to completion and retirement of their

results)

What to do with instructions after the exception point?(flush them out so that they do not affect the state)

Precise versus imprecise exceptions

Precise exceptions hide the effects of pipelining and parallelism by forcing the same state as that of strict sequential execution

(desirable, because exception handling is not complicated)

Imprecise exceptions are messy, but lead to faster hardware(interrupt handler can clean up to offer precise

exception)

Page 66: Skup instrukcija  mikropocesora MIPS

The Three Hardware Designs for MicroMIPS

/

ALU

Data cache

Instr cache

Next addr

Reg file

op

jta

fn

inst

imm

rs (rs)

(rt)

Data addr

Data in 0

1

ALUSrc ALUFunc DataWrite

DataRead

SE

RegInSrc

rt

rd

RegDst RegWrite

32 / 16

Register input

Data out

Func

ALUOvfl

Ovfl

31

0 1 2

Next PC

Incr PC

(PC)

Br&Jump

ALU out

PC

0 1 2

Single-cycle

/

16

rs

0 1

0 1 2

ALU

Cache Reg file

op

jta

fn

(rs)

(rt)

Address

Data

Inst Reg

Data Reg

x Reg

y Reg

z Reg PC

4

ALUSrcX

ALUFunc

MemWrite MemRead

RegInSrc

4

rd

RegDst

RegWrite

/

32

Func

ALUOvfl

Ovfl

31

PCSrc PCWrite

IRWrite

ALU out

0 1

0 1

0 1 2 3

0 1 2 3

InstData ALUSrcY

SysCallAddr

/

26

4

rt

ALUZero

Zero

x Mux

y Mux

0 1

JumpAddr

4 MSBs

/

30

30

SE

imm

Multicycle

ALU

Data cache

Instr cache

Next addr

Reg file

op fn

inst

imm

rs (rs)

(rt)

Data addr

ALUSrc ALUFunc

DataWrite DataRead

RegInSrc

rt

rd

RegDst RegWrite

Func

ALUOvfl

Ovfl

IncrPC

Br&Jump

PC

1 Incr

0

1

rt

31

0 1 2

NextPC

0

1

SeqInst

0 1 2

0 1

RetAddr

Stage 1 Stage 2 Stage 3 Stage 4 Stage 5

SE

5 3

2

Pipelined

125 MHzCPI = 1

500 MHzCPI 4

500 MHzCPI 1.1

Page 67: Skup instrukcija  mikropocesora MIPS

Main MIPS assembly language instruction set

Category

Instruction Example Meaning Comments

Arithmetic

add add $sl, $s2, $s3

$s1 = $s2 + $s3

Three operands; overflow detected

subtract sub $s1, $s2, $s3

$s1 = $s2 - $s3 Three operands; overflow detected

add immediate addi $s1, $s2, 100

$s1 = $s2 + 100

+ constant; overflow detected

add unsigned addu $s1, $s2, $s3

$s1 = $s2 + $s3

Three operands; overflow undetected

subtract unsigned subu $sl, $s2, $s3

$s1 = $s2 - $s3 Three operands; overflow undetected

add immediate unsigned

addiu $s1, $s2, 100

$s1 = $s2 + 100

+ constant; overflow undetected

move from coporoc. reg.

mfc0 $s1, $epc $s1 = $epc Used to copy Exception PC plus other special registers

multiply mult $s2, $s3 Hi, Lo = $s2 * $s3

64-bit signed product in Hi, Lo

multiply unsigned multu $s2, $s3 Hi, Lo = $s2 * $s3

64-bit unsigned product in Hi, Lo

divide div $s2, $s3 Lo = $s2/ $s3, Hi = $s2 mod $s3

Lo = quotient, Hi = remainder

divide unsigned divu $s2, $s3 Lo = $s2/ $s3, Hi = $s2 mod $s3

Unsigned quotient and remainder

move from Hi mfhi $s1 $s1 = Hi Used to get copy of Hi

move from Lo mflo $s1 $s1 = Lo Used to get copy of Lo

Page 68: Skup instrukcija  mikropocesora MIPS

Category

Instruction Example Meaning Comments

Logical and and $s1, $s2, $s3

$s1 = $s2 & $s3 Three reg. operands; logical AND

or or $s1, $s2, $s3

$s1 = $s2 ! $s3 Three reg. operands; logical OR

and immediate

andi $s1, $s2, 100

$s1 = $s2 & 100 Logical AND reg. constant

or ori $s1, $s2, 100

$s1 = $s2 ! 100 Logical OR reg. constant

shiftr left logical

sll $s1, $s2, 10 $s1 = $s2 << 10 Shift left by constant

shift right logical

srl $s1, $s2, 10 $s1 = $s2 >> 10 Shift right by constant

Data transfer

load word lw $s1, 100($s2)

$s1 = Memory [$s2+100]

Word from memory to register

store word sw $s1, 100($s2)

Memory [$s2+100]= $s1

Word from register to memory

load byte unsigned

lbu $s1, 100($s2)

$s1 = Memory [$s2+100]

Byte from memory to register

store byte sb $s1, 100($s2)

Memory [$s2+100]= $s1

Byte from register to memory

load upper immediate

lui $s1, 100 $s1 = 100*216 Loads constant in upper 16 bits

Main MIPS assembly language instruction set– cont.

Page 69: Skup instrukcija  mikropocesora MIPS

Category

Instruction Example Meaning Comments

Conditional branch

branch on equal

beq $s1, $s2, 25

if ($s1 ==$s2) go to PC + 4 + 100

Equal test; PC relative branch

branch on not equal

bne $s1, $s2, 25

if ($s1!=$s2) go to PC + 4 + 100

Not equal test; PC relative

set on less than

slt $s1, $s2, $s3

if($s2< $s3) $s1 = 1; else $s1 = 0

Compare less than; two's complement

set less than immediate

slti $s1, $s2, 100

if ($s2< 100) $s1 = 1; else $s1 = 0

Compare < constant; two's complement

set less than unsigned

sltu $s1, $s2, $s3

if($s2< $s3) $s1 = 1; else $s1 = 0

Compare less than; natural numbers

set less than unsigned immediate

sltiu $s1, $s2, 100

if($s2< 100) $s1 = 1; else $s1 = 0

Compare < constant; natural numbers

Unconditi-onal jump

jump j 2500 go to 10000 Jump to target address

jump register jr $ra go to $ra For switch, procedure return

jump and link jal 2500 $ra =PC+4; go to 10000

For procedure call

Main MIPS assembly language instruction set– cont.

Page 70: Skup instrukcija  mikropocesora MIPS

MIPS machine Language Example

Name Format

6 bits

5 bits

5 bits 5 bits 5 bits

6 bits

Coments 

add R 0 2 3 1 0 32 add $1, $2, $3

 

sub R 0 2 3 1 0 34 sub $1, $2, $3

 

addi I 8 2 1 100 addi $1, $2,100

 

addu R 0 2 3 1 0 33 addu $1, $2, $3

 

subu R 0 2 3 1 0 35 subu $1, $2, $3

 

addiu I 9 2 1 100 addiu $1,$2,100

 

mfc0 R 16 0 1 14 0 0 mfc0 $1, $epc

 

mult R 0 2 3 0 0 24 mult $2, $3  

multu R 0 2 3 0 0 25 multu $2, $3  

div R 0 2 3 0 0 26 div $2, $3  

divu R 0 2 3 0 0 27 divu $2, $3  

mfhi R 0 0 0 1 0 16 mfhi $1  

mflo R 0 0 0 1 0 18 mflo $1  

and R 0 2 3 1 0 36 and $1, $2, $3

 

or R 0 2 3 1 0 37 or $1, $2, $3  

Page 71: Skup instrukcija  mikropocesora MIPS

Example

Name Format

6 bits 5 bits 5 bits 5 bits

5 bits

6 bits Coments 

andi I 12 2 1 100 andi $1, $2,100

 

ori I 13 2 1 100 ori $1, $2,100

 

sll R 0 0 2 1 10

0 sll $1, $2, 10 

srl R 0 0 2 1 10

2 srl $1, $2, 10

 

lw I 35 2 1 100 lw $1,100($2)

 

sw I 43 2 1 100 sw $1,100($2)

 

lui I 15 0 1 100 lui $1,100  

beq I 4 1 2 25 beq $1,$2,100

 

bne I 5 1 2 25 bne $1,$2,100

 

slt R 0 2 3 1 0 42 slt $1, $2, $3

 

slti I 10 2 1 100 slti $1,$2,100

 

sltu R 0 2 3 1 0 43 sltu $1, $2, $3

 

sltiu I 11 2 1 100 sltiu $1,$2,100

 

j J 2 2500 j 10000  

jr R 0 31 0 0 0 8 jr $31  

jal J 3 2500 jal 10000  

MIPS machine Language – cont.

Page 72: Skup instrukcija  mikropocesora MIPS

MIPS Instruction formats

Name Flelds

Comments

Field size

6 bits

5 bits 5 bits

5 bits 5 bits

6 bits

All MIPS instructions 32 bits

R-format

op rs rt rd shamt

funct Arithemetic instruction format

I-Format

op rs rt address/immediate Arithemetic instruction format

J-Format

op target address Jump instruction format

Main MIPS machine language, Formats and examples are shown, with values in each field: op and funct fields form the opcode (each 6 bits), rs field gives a source register (5 bits), rt is also normally a source register (5 bits), rd is the destination register (5 bits), and shamt supplies the shift amount (5 bits). The field values are all in decimal. Floating-point machine language instructions are shown in Figure 4.47 on page 291. Appendix A gives the full MIPS machine language.

Page 73: Skup instrukcija  mikropocesora MIPS

MIPS floating-point operands

Name Example Comments

32 floating point registers

$f0, $f1, $f2..., $f31,

MIPS floating -point registers are used in pairs for double precision numbers.

230 memory words

Memory [0], Memory [4], ....,Memory [4294967292]

Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.

Page 74: Skup instrukcija  mikropocesora MIPS

MIPS floating-point assembly language

Category

Instruction Example Meaning Comments

Arthmetic

FP add single add.s $f2, $f4, $f6

$f2 = $f4 + $f6 FP add (single precision)

FP subtract single

sub.s $f2, $f4, $f6

$f2 = $f4 - $f6 FP sub (single precision)

FP multiply single

mul.s $f2, $f4, $f6

$f2 = $f4 * $f6 FP. multiply(single precision)

FP divide single

div.s $f2, $f4, $f6

$f2 = $f4 / $f6 FP divide (single precision)

FP add double add.d $f2, $f4, $f6

$f2 = $f4 + $f6 FP add (double precision)

FP subtract double

sub.d $f2, $f4, $f6

$f2 = $f4 - $f6 FP sub (double precision)

FP multiply double

mul.d $f2, $f4, $f6

$f2 = $f4 * $f6 FP multiply (double precis.)

FP divide double

div.d $f2, $f4, $f6

$f2 = $f4 / $f6 FP divide (double precis.)

Data transfer

load word copr.1

lwc1 $f1, 100($s2)

$f1 = Mem [$s2+100] 32-bit data to FP register

store word copr.1

swc1 $f1,100($s2)

Mem[$s2+100] =$f1 32-bit data to memory

Conditional branch

branch on FP true

bclt 25 if(cond==1) go to PC+4+100

PC relative branch if FP cond.

branch on FP false

bclf 25 if(cond==0) go to PC+4+100

PC relative branch if not cond.

FP compare single (eq.ne.lt.le.gt.ge)

c.lt.s $f2, $f4 if($f2 < $f4) cond=1; else cond = 0

FP compare less than single precision

FP compare double

c.lt.d $f2, $f4 if($f2 <$f4) (eq.ne.lt.le.gt.ge) cond=1; else cond = 0

FP compare less than single precision

Page 75: Skup instrukcija  mikropocesora MIPS

MIPS floating-point machine language

Name Format

Example Comments

add.s R 17 16 6 4 2 0 add.s $f2, $f4 $f6

sub.s R 17 16 6 4 2 1 sub.s $f2, $f4 $f6

mul.s R 17 16 6 4 2 2 mul.s $f2, $f4 $f6

div.s R 17 16 6 4 2 3 div.s $f2, $f4 $f6

add.d R 17 17 6 4 2 0 add.d $f2, $f4 $f6

sub.d R 17 17 6 4 2 1 sub.d $f2, $f4 $f6

mul.d R 17 17 6 4 2 2 mul.d $f2, $f4 $f6

div.d R 17 17 6 4 2 3 div.d $f2, $f4 $f6

lwc1 I 49 20 2 100 lwc1 $f2, 100($s4)

swc1 I 57 20 2 100 swc1 $f2, 100($s4)

bc1t I 17 8 1 25 bc1t 25

bc1f I 17 8 0 25 bc1f 25

c.lt.s R 17 16 4 2 0 60 c.lt.s $f2, $f4

c.lt.d R 17 17 4 2 0 60 c.lt.d $f2, $f4

Field size

6 bits

5 bits

5 bits 5 bits 5 bits 6 bits All MIPS instruction 32 bits

Page 76: Skup instrukcija  mikropocesora MIPS

Arithmetic and Logical Instructions

Addition (with overflow)

add rd,rs,rt 6 5 5 5 5 6

0 rs rt rd 0 0x20

Absolute value

abs rdest,rsrc pseudoinstruction

Put the absolute value of register rsrc in register rdest.

Addition (without overflow)

addu rd,rs,rt 6 5 5 5 5 6

Put the sum of registers rs and rt into register rd.

0 rs rt rd 0 0x21

Page 77: Skup instrukcija  mikropocesora MIPS

Addition immediate (without overflow)

addi rt,rs,imm 6 5 5 16

Arithmetic and Logical Instructions – con.

8 rs rt imm

Addition immediate (without overflow)

addi rt,rs,imm 6 5 5 16

Put the sum of registers rs and sign-extended immediate into register rt.

9 rs rt imm

Page 78: Skup instrukcija  mikropocesora MIPS

AND

and rd,rs,rt 6 5 5 5 5 6

Put the logical AND of registers rs and rt into register rd.

Arithmetic and Logical Instructions – con.

And immediate

andi rd,rs,imm 6 5 5 16

Put the logical AND of register rs and the zero-extended immediate into register rt.

0xc rs rt imm

0 rs rt rd 0 0x24

Page 79: Skup instrukcija  mikropocesora MIPS

Arithmetic and Logical Instructions – con.

Divide (with overflow)

div rs,rt 6 5 5 10 6

Divide (with overflow)

div rdest,rsrc 1,src2 pseudoinstruction

Divide (without overflow)

divu rs,rt 6 5 5 10 6

0 rs rt 0 0x1a

0 rs rt 0 0x1b

Divide (without overflow)

div rdest,rsrc 1,src2 pseudoinstruction

Put the quotient of register rsrc1 and src2 into register rdest.

Divide register rs by register rt. Leave the quotient in register lo and the remainder in register hi.

Page 80: Skup instrukcija  mikropocesora MIPS

Arithmetic and Logical Instructions – con.

Multiply

mult rs,rt 6 5 5 10 6

Multiply (without overflow)

mul rdest,rsrc 1,src2 pseudoinstruction

Unsigned multiply

multu rs,rt 6 5 5 10 6

0 rs rt 0 0x18

0 rs rt 0 0x19

Unsigned multiply (with overflow)

mulou rdest,rsrc1,src2 pseudoinstruction

Put the product of register rsrc1 and src2 into register rdest.

Multiply (with overflow)

mulo rdest,rsrc 1,src2 pseudoinstruction

Multiply register rs and rt. Leave the low-order word of the product in register lo and the high-order word in register hi.

Page 81: Skup instrukcija  mikropocesora MIPS

Arithmetic and Logical Instructions – con.

NOR

nor rd,rs,rt 6 5 5 5 5 6

Put the logical NOR of registers rs and rt into register rd.

0 rs rt rd 0 0x27

Negate value (with overflow)

neg rdest,rsrc pseudoinstruction

NOT

not rdest,rsrc pseudoinstruction

Put the sum of registers rs and rt into register rd.

Negate value (without overflow)

negu rdest,rsrc pseudoinstruction

Put the negative of register rsrc into register rdst.

Page 82: Skup instrukcija  mikropocesora MIPS

OR

or rd,rs,rt 6 5 5 5 5 6

Put the logical OR of registers rs and rt into register rd.

OR immediate

ori rt,rs,imm 6 5 5 16

Put the logical OR of register rs and the zero-extended immediate into register rt.

0xd rs rt imm

0 rs rt rd 0 0x25

Arithmetic and Logical Instructions – con.

Remainder

rem rdest,rsrc1,rsrc2 pseudoinstruction

Unsigned remainder

remu rdest,rsrc1,rsrc2 pseudoinstructionPut the remainder od register rsrc1 divided by register rsrc2 into register rdest. Note that if an operand is negative, the remainder is unspecified by the MIPS architecture and depends on the convention of the machine on which SPIM is run.

Page 83: Skup instrukcija  mikropocesora MIPS

Arithmetic and Logical Instructions – con.

Shift left logical

sll rd,rt,shamt 6 5 5 5 5 6

0 rs rt rd shamt 0

Shift left logical variable

sllv rd,rt,rs 6 5 5 5 5 6

0 rs rt rd 0 4

Shift right arithmetic

sra rd,rt,shamt 6 5 5 5 5 6

0 rs rt rd shamt 3

Shift right arithmetic variable

srav rd,rt,rs 6 5 5 5 5 6

0 rs rt rd 0 7

Page 84: Skup instrukcija  mikropocesora MIPS

Arithmetic and Logical Instructions – con.

Shift right logical

sra rd,rt,shamt 6 5 5 5 5 6

0 rs rt rd shamt 2

Shift right logical variable

srlv rd,rt,rs 6 5 5 5 5 6

Shift register rt left (right) by the distance indicated by immediate shamt or the register rs and put the result in register rd.

0 rs rt rd 0 6

Rotate left

rol rdest,rsrc1,rsrc2 pseudoinstruction

Rotate right

ror rdest,rsrc1,rsrc2 pseudoinstructionRotate register rsrc1 left(right) by the distance indicated by rsrc2 and put the result in register rdest.

Page 85: Skup instrukcija  mikropocesora MIPS

Arithmetic and Logical Instructions – con.

Subtract (with overflow)

sub rd,rs,rt 6 5 5 5 5 6

Subtract (without overflow)

subu rd,rs,rt 6 5 5 5 5 6

Put the difference of register rs and rt into register rd.

0 rs rt rd 0 0x22

0 rs rt rd 0 0x23

Exclusive OR

xor rd,rs,rt 6 5 5 5 5 6

Put the logical XOR of registers rs and the zero-extended immediate into register rt.

0 rs rt rd 0 0x26

XOR immediate

xori rt,rs,imm 6 5 5 16

Put the logical XOR of register rs and the zero-extended immediate into register rt.

0xe rs rt imm

Page 86: Skup instrukcija  mikropocesora MIPS

Constant-Manipulating Instructions

Load upper immediate

lui rt,imm 6 5 5 16

Load the lower halfword of the immediate imm into the upper halfword of registers rt. The lower bits of the register are set to 0.

0xf rs rt imm

Load immediate

li rdest,imm pseudoinstructionMove the immediate imm into register rdest.

Page 87: Skup instrukcija  mikropocesora MIPS

Comparison Instructions Set less than

slt rd,rs,rt 6 5 5 5 5 6

Set less than unsigned

sltu rd,rs,rt 6 5 5 5 5 6

Set register rd to 1 if register rs is less than rt, and to 0 otherwise.

0 rs rt rd 0 0x2a

0 rs rt rd 0 0x2b

Set less than immediate

slti rd,rs,imm 6 5 5 16

Set less than unsigned immediate

sltiu rd,rs,imm 6 5 5 16

Set register rd to 1 if register rs is less than the sign-extended immediate, and to 0 otherwise.

0xb rs rd imm

0xa rs rd imm

Page 88: Skup instrukcija  mikropocesora MIPS

Comparison Instructions – cont.

Set greater than equal

sge rdest,rsrc1,rsrc2 pseudoinstruction

Set equal

seq rdest,rsrc1,rsrc2 pseudoinstruction

Set register rdest to 1 if register rsrc1 equals rsrc2, and to 0 otherwise.

Set greater than equal unsigned

sgeu rdest,rsrc1,rsrc2 pseudoinstructionSet register rdest to 1 if register rsrc1 is greater than or equal to rsrc2, and to 0 otherwise.

Set greater than

sgt rdest,rsrc1,rsrc2 pseudoinstruction

Page 89: Skup instrukcija  mikropocesora MIPS

Comparison Instructions – cont.

Set greater than unsigned

sgtu rdest,rsrc1,rsrc2 pseudoinstructionSet register rdest to 1 if register rsrc1 is greater than rsrc2, and to 0 otherwise.

Set less than equal

sle rdest,rsrc1,rsrc2 pseudoinstruction

Set less than equal unsigned

sleu rdest,rsrc1,rsrc2 pseudoinstructionSet register rdest to 1 if register rsrc1 is less than or equal to rsrc2, and to 0 otherwise.

Set not equal

sne rdest,rsrc1,rsrc2 pseudoinstructionSet register rdest to 1 if register rsrc1 is not equal to rsrc2, and to 0 otherwise.

Page 90: Skup instrukcija  mikropocesora MIPS

Branch Instructions

Branch Instruction

b label pseudoinstructionUnconditionally branch to the instruction at the label.

Branch coprocessor z true

bczt label 6 5 5 16

Branch coprocessor z false

bczf label 6 5 5 16

Conditionally branch the number of instructions specified by the offset if z's condition flag is true (false). z is 0, 1, 2, or 3. The floating-point unit is z = 1.

0x1z 8 1 offset

0x1z 8 0 offset

Page 91: Skup instrukcija  mikropocesora MIPS

Branch Instructions – cont.

Branch on equal

beq rs,rt,label 6 5 5 16

Conditionally branch the number of instructions specified by the offset if register rs equals rt.

Branch on greater than equal zero

bgez rs,label 6 5 5 16

Conditionally branch the number of instructions specified by the offset if register rs is greater than or equal to 0.

4 rs rt offset

1 rs 1 offset

Branch on greater than equal zero and link

bgezal rs,label 6 5 5 16

Conditionally branch the number of instructions specified by the offset if register rs is greater than or equal to 0. Save the address of the next instruction in register 31.

1 rs 0x11 offset

Page 92: Skup instrukcija  mikropocesora MIPS

Branch Instructions – cont.

Branch on greater than zero

bgtz rs,label 6 5 5 16

Conditionally branch the number of instructions specified by the offset if register rs is greater than 0.

Branch on less than equal zero

blez rs,label 6 5 5 16

Conditionally branch the number of instructions specified by the offset if register rs is less than or equal to 0.

7 rs 0 offset

6 rs 0 offset

Branch on less than equal zero and link

bltzal rs,label 6 5 5 16

Conditionally branch the number of instructions specified by the offset if register rs is less than 0. Save the address of the next instruction in register 31.

1 rs 0x10 offset

Page 93: Skup instrukcija  mikropocesora MIPS

Branch Instructions – cont.

Branch on less than zero

bltz rs,label 6 5 5 16

Conditionally branch the number of instructions specified by the offset if register rs is less than 0.

Branch on not equal

bne rs,label 6 5 5 16

Conditionally branch the number of instructions specified by the offset if register rs is not equal to rt.

1 rs 0 offset

5 rs rt offset

Branch on equal zero

beqz rsrc,label pseudoinstruction

Conditionally branch to the instructions at the label if rsrc1 equals 0.

Page 94: Skup instrukcija  mikropocesora MIPS

Branch Instructions – cont.

Branch on greater than equal

bge rsrc1,rsrc2,label pseudoinstruction

Branch on greater than equal unsigned

bgeu rsrc1,rsrc2,label pseudoinstruction

Conditionally branch to the instruction at the label if register rsrc1 is greater than or equal to rsrc2.

Branch on greater than

bgt rsrc1,rsrc2,label pseudoinstruction

Branch on greater than unsigned

bgtu rsrc1,rsrc2,label pseudoinstruction

Conditionally branch to the instruction at the label if register rsrc1 is greater than rsrc2.

Page 95: Skup instrukcija  mikropocesora MIPS

Branch Instructions – cont. Branch on less than equal

ble rsrc1,rsrc2,label pseudoinstruction

Branch on less than equal unsigned

bleu rsrc1,rsrc2,label pseudoinstructionConditionally branch to the instruction at the label if register rsrc1 is less than or equal to rsrc2.

Branch on less than

blt rsrc1,rsrc2,label pseudoinstruction

Branch on less than unsigned

bltu rsrc1,rsrc2,label pseudoinstructionConditionally branch to the instruction at the label if register rsrc1 is less than rsrc2. Branch on not equal zero

bnez rsrc,label pseudoinstructionConditionally branch to the instruction at the label if register rsrc is not equal to 0.

Page 96: Skup instrukcija  mikropocesora MIPS

Jump Instructions

Jump

J target 6 26

Unconditionally jump to the instruction at target.

Jump and link

jal target 6 26

Unconditionally jump to the instruction at target. Save the address of the next instruction in register rd.

2 target

2 target

Jump and link register

jalr rs,rd 6 5 5 5 5 6

Unconditionally jump to the instruction whose address is in register rs. Save the address of the next instruction in register rd (which defaults to 31).

0 rs 0 rd 0 9

Page 97: Skup instrukcija  mikropocesora MIPS

Jump Instructions – cont. Jump register

jr rs 6 5 5 16

Unconditionally jump to the instruction whose address is in register rs.

0x0 rs 0 0x8

Load address

la rdest,address pseudoinstructionLoad computed address-not the contents of the location-into register rdest.

Load byte

lb rt,address 6 5 5 16

0x20 rs rt offset

Load unsigned byte

lbu rt,address 6 5 5 16

Load the byte at address into register rt. The byte is sign-extended by lb, but not by lbu.

0x24 rs rt offset

Page 98: Skup instrukcija  mikropocesora MIPS

Load halfword

lh rt,address 6 5 5 16

0x21 rs rt offset

Load unsigned halfword

lhu rt,address 6 5 5 16

Load the 16-bit quantity (halfword) at address into register rt. The halfword is sign-extended by lh, but not by lhu.

0x25 rs rt offset

Jump Instructions – cont.

Load word

lw rt,address 6 5 5 16

Load the 32-bit quantity (word) at address into register rt.

0x23 rs rt offset

Page 99: Skup instrukcija  mikropocesora MIPS

Jump Instructions – cont.

Load word coprocessor

lwcz rt,address 6 5 5 16

Load the word at address into register rt of coprocessor z (0-3). The floating-point unit is z=1.

0x3z rs rt offset

Load word left

lwl rt,address 6 5 5 16

0x22 rs rt offset

Load word right

lwr rt,address 6 5 5 16

Load the left (right) bytes from the word at the possibly unaligned address into register rt.

0x23 rs rt offset

Page 100: Skup instrukcija  mikropocesora MIPS

Jump Instructions – cont. Load doubleword

ld rdest,address pseudoinstruction Load the 64-bit quantity at address into register rdest and rdest + 1.

Unaligned load halfword

ulh rdest,address pseudoinstruction

Unsigned load halfword unaligned

ulhu rdest,address pseudoinstruction Load the 16-bit quantity (halfword) at the possibly unaligned address into register rdest. The halfword is sign-extended by ulh, but not ulhu.

Unsigned load word

ulw rdest,address pseudoinstruction Load the 32-bit quantity (word) at the possibly unaligned address into register rdest.

Page 101: Skup instrukcija  mikropocesora MIPS

Store Instructions Store byte

sb rt,address 6 5 5 16

Store the low byte from register rt at address.

0x28 rs rt 0x8

Store word

sw rt,address 6 5 5 16

Store the word from register rt at address.

0x2b rs rt offset

Store word coprocessor

swcz rt,address 6 5 5 16

Store the word from register rt of coprocessor z at address. The floating point unit is z = 1.

0x3(1-z) rs rt offset

Store halfword

sh rt,address 6 5 5 16

Store the low halfword from register rt at address.

0x29 rs rt 0x8

Page 102: Skup instrukcija  mikropocesora MIPS

Store Instructions – cont. Store word left

swl rt,address 6 5 5 16

0x2a rs rt 0x8

Store doublewordsd rsrc,address pseudoinstruction

Store the 64-bit quantity in registers rsrc and rsrc + 1 address.

Store word right

swr rt,address 6 5 5 16

Store the left (right) bytes from register rt at the possibly unaligned address.

0x29 rs rt 0x8

Unaligned store halfwordush rsrc,address pseudoinstruction Store the low halfword from register rsrc at the possibly unaligned address.

Unaligned store wordusw rsrc,address pseudoinstruction Store the word from register rsrc at the possibly unaligned address.

Page 103: Skup instrukcija  mikropocesora MIPS

Data Movement Instructions Move

move rdest,rsrc pseudoinstructionMove register rsrc to rdest.

Move from hi

mfhi rd 6 10 5 5 6

0 0 rd 0 0x10

Move from lo

mflo rd 6 10 5 5 6

Move the hi (lo) register to register rd.

0 0 rd 0 0x12

Move to hi

mthi 6 5 15 6

0 rs 0 0x11

Page 104: Skup instrukcija  mikropocesora MIPS

Move to lo

mtlo 6 5 15 6

Move register rs to the hi (lo) register.

Data Movement Instructions – cont.

0 rs 0 0x11

Move from coprocessor z

mfcz rt,rd 6 5 5 5 11

Move coprocessor z's register rd to CPU register rt. The floating-point unit is coprocessor z=1.

0x1z 0 rt rd 0

Move double from coprocessor zmfcl,d rdest,rsrc1 pseudoinstructionMove floating-point registers frsrc1 and frsrc1 + 1 to CPU registers rdest and rdest + 1.

Move to coprocessor z

mtcz rt,rd 6 5 5 5 11

Move CPU register rt to coprocessor z's register rd.

0x1z 0 rt rd 0

Page 105: Skup instrukcija  mikropocesora MIPS

Floating-Point Instructions Floating-point absolute value double

abs.d fd,fs 6 5 5 5 5 6

0x11 1 0 fs fd 5

Floating-point absolute value single

abs.s fd,fs 6 5 5 5 5 6

Compute the absolute value of the floating-point double (single) in register fs and put it in register fd.

0x11 1 0 fs fd 5

Floating-point addition double

add.d fd,fs,ft 6 5 5 5 5 6

0x11 1 0 fs fd 5

Floating-point addition single

add.s fd,fs,ft 6 5 5 5 5 6

Compute the sum of the floating-point double (single) in register fs and ft and put it in register fd.

0x11 1 0 fs fd 5

Page 106: Skup instrukcija  mikropocesora MIPS

Floating-Point Instructions – cont. Compare equal double

c.eq.d fs,ft 6 5 5 5 5 2 4

0x11 1 ft fs fd FC 2

Compare equal single

c.eq.s fs,ft 6 5 5 5 5 2 4

Compare the floating-point double in register fs against the one in ft and set the floating-point condition flag true if they are equal.

0x11 0 ft fs fd FC 2

Compare less than equal double

c.le.d fs,ft 6 5 5 5 5 2 4

0x11 1 ft fs fd FC 2

Compare less than equal single

c.le.s fs,ft 6 5 5 5 5 2 4

Compute the floating-point double in register fs against the one in ft and set the floating-point condition flag true if the first is less than or equal to the second. Use the bclt or bclf instructions to test the value of this flag.

0x11 0 ft fs fd FC 2

Page 107: Skup instrukcija  mikropocesora MIPS

Floating-Point Instructions – cont. Compare less than double

c.lt.d fs,ft 6 5 5 5 5 2 4

0x11 1 ft fs 0 FC 2

Compare less than single

c.lt.s fs,ft 6 5 5 5 5 2 4

Compute the floating-point double in register fs against the one in ft and set the condition flag true if the first is less than the second. Use the bclt or bclf instructions to test the value of this flag.

0x11 0 ft fs 0 FC 2

Convert single to double

cvt.d.s fd,fs 6 5 5 5 5 6

0x11 1 0 fs fd 0x21

Convert integer to double

cvt.d.s fd,fs 6 5 5 5 5 6

Convert the single precisioion floating-point number or integer in register fs to a double precision number and put it in register fd.

0x11 1 0 fs fd 0x21

Page 108: Skup instrukcija  mikropocesora MIPS

Floating-Point Instructions – cont. Convert double to single

cvt.s.d fd,fs 6 5 5 5 5 6

0x11 1 0 fs fd 0x20

Convert integer to single

cvt.s.w fd,fs 6 5 5 5 5 6

Convert the double precisioion floating-point number or integer in register fs to a single precision number and put it in register fd.

0x11 0 0 fs fd 0x20

Convert double to integer

cvt.w.d fd,fs 6 5 5 5 5 6

0x11 1 0 fs fd 0x24

Convert single to integer

cvt.w.s fd,fs 6 5 5 5 5 6

Convert the double or single precisioion floating-point number in register fs to an integer and put it in register fd.

0x11 0 0 fs fd 0x24

Page 109: Skup instrukcija  mikropocesora MIPS

Floating-Point Instructions – cont.

Floating-point divide double

div.d fd,fs,ft 6 5 5 5 5 6

0x11 1 ft fs fd 3

Floating-point divide single

div.s fd,fs,ft 6 5 5 5 5 6

Compute the quotient of the floating-point doubles (singles) in registers fs and ft and put it in register fd.

0x11 0 ft fs fd 3

Load floating-point double

l.d fdest,address pseudoinstruction

Load floating-point single

l.s fdest,address pseudoinstruction

Load the floating-point double (single) at address into register fdest.

Page 110: Skup instrukcija  mikropocesora MIPS

Floating-Point Instructions – cont.

Move floating-point double

mov.d fd,fs 6 5 5 5 5 6

0x11 1 0 fs fd 6

Move floating-point single

mov.s fd,fs 6 5 5 5 5 6

Move the floating-point double (single) from register fs to register fd.

0x11 0 0 fs fd 6

Floating-point multiply double

mul.d fd,fs,ft 6 5 5 5 5 6

0x11 1 ft fs fd 2

Floating-point multiply single

mul.s fd,fs,ft 6 5 5 5 5 6

Compute the product of the floating-point dobule (single) in registers fs and ft and put it in register fd.

0x11 0 ft fs fd 2

Page 111: Skup instrukcija  mikropocesora MIPS

Floating-Point Instructions – cont.

Negate double

neg.d fd,fs 6 5 5 5 5 6

0x11 1 ft fs fd 7

Negate single

neg.s fd,fs 6 5 5 5 5 6

Negate the floating-point double (single) in register fs and put it in register fd.

0x11 0 0 fs fd 7

Store floating-point double

s.d fdest,address pseudoinstruction

Store floating-point single

s.s fdest,address pseudoinstructionStore the floating-point double (single) in register fdest at address.

Page 112: Skup instrukcija  mikropocesora MIPS

Floating-Point Instructions – cont.

Floating-point subtract double

sub.d fd,fs,ft 6 5 5 5 5 6

0x11 1 ft fs fd 1

Floating-point subtract single

sub.s fd,fs,ft 6 5 5 5 5 6

Compute the difference of the floating-point double (single) in registers fs and ft and put it in register fd.

0x11 0 ft fs fd 1

Page 113: Skup instrukcija  mikropocesora MIPS

Floating-Point Instructions Return from exception

rfe 6 1 19 6

Restore the Status register.

0x10 1 0 0x20

System call

syscall 6 20 6

Register $v0 contains the number of the system call.

0 0 0xc

Break

break 6 20 6

Cause exception code. Exception 1 is reserved for the debugger.

0 code 0xd

No operation

nop 6 5 5 5 5 6

Do nothing.

0 0 0 0 0 0