1 Sequential CPU Implementation. 2 Outline Logic design Organizing Processing into Stages SEQ timing...

Preview:

Citation preview

1

Sequential CPU Implementation

2

Outline

• Logic design

• Organizing Processing into Stages

• SEQ timing

• Suggested Reading 4.2,4.3.1 ~ 4.3.3

OFZFSF

OFZFSF

OFZFSF

OFZFSF

Arithmetic Logic Unit

• Combinational logic– Continuously responding to inputs

• Control signal selects function computed– Corresponding to 4 arithmetic/logical operations in Y86

• Also computes values for condition codes• We will use it as a basic component for our CPU

ALU

Y

X

X + Y

0

ALU

Y

X

X - Y

1

ALU

Y

X

X & Y

2

ALU

Y

X

X ^ Y

3

A

B

A

B

A

B

A

B

3

Storage

• Registers– Hold single words or bits– Loaded as clock rises– Not only program registers

I O

Clock

4

Storing 1 Bit

5

V1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Vin

V1

Bistable ElementQ+

Q–

q

!q

q = 0 or 1

Vin V1

V2

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Vin

V1

V2

Storing 1 Bit (cont.)

60

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Vin

V1

V2

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Vin

Vin

V2

Bistable ElementQ+

Q–

q

!q

q = 0 or 1

Vin V1

V2V2

Vin V1

Vin = V2

Stable 0

Stable 1

Metastable

Physical Analogy

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Vin

V1

V2

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Vin

Vin

V2

Stable 0

Stable 1

Metastable

.Stable left . Stable right.

Metastable

7

Storing and Accessing 1 Bit

Q+

Q–

R

S

R-S Latch

Q+

Q–

R

S

Q+

Q–

R

S

Resetting1

0

1 0

0 1

Q+

Q–

R

S

Q+

Q–

R

S

Setting0

1

0 1

1 0

Q+

Q–

R

S

Q+

Q–

R

S

Storing0

0

!q q

q !q

Bistable Element

Q+

Q–

q

!q

q = 0 or 1

8

1-Bit Latch

Latching

1

Q+

Q–

R

S

D

C

Q+

Q–

R

S

D

C

d !d !d !d d

d d !d0

Storing

Q+

Q–

R

S

D

C

Q+

Q–

R

S

D

C

d !d q

!q

!q

q0

0

9

D Latch

Q+

Q–

R

S

D

C

Data

Clock

Transparent 1-Bit Latch

– When in latching mode, combinational propogation from D to Q+ and Q–

– Value latched depends on value of D as C falls

Latching

1

Q+

Q–

R

S

D

C

Q+

Q–

R

S

D

C

d !d !d !d d

d d !d

C

D

Q+

Time

Changing D

10

Edge-Triggered Latch

– Only in latching mode for brief periodRising clock edge

– Value latched depends on data as clock rises

– Output remains stable at all other times

Q+

Q–

R

S

D

C

Data

Clock TTrigger

C

D

Q+

Time

T

11

Registers

– Stores word of dataDifferent from program registers seen in

assembly code

– Collection of edge-triggered latches– Loads input on rising edge of clock

I O

Clock

DC

Q+

DC

Q+

DC

Q+

DC

Q+

DC

Q+

DC

Q+

DC

Q+

DC

Q+

i7

i6

i5

i4

i3

i2

i1

i0

o7

o6

o5

o4

o3

o2

o1

o0

Clock

Structure

12

Register Operation

• Stores data bits• For most of time acts as barrier between

input and output• As clock rises, loads input

State = x Risingclock

Output = xInput = y

x

State = y

Output = y

y

13

State Machine Example

Comb. Logic

ALU

0

Out

MUX

0

1

Clock

In

Load

0

14

State Machine Example

15

Comb. Logic

ALU

0

OutMUX0

1

Clock

In

Load

x0

1

x0 ?

?

x0

Clock

Load

In

Out

State Machine Example

16

Comb. Logic

ALU

0

OutMUX0

1

Clock

In

Load

x0

0

x0+x0 x0

X0

x0

Clock

Load

In

Out x0

State Machine Example

17

Comb. Logic

ALU

0

OutMUX0

1

Clock

In

Load

x1

0

x0+x1 x0

X0

x0

Clock

Load

In

Out x0

x1

State Machine Example

18

Comb. Logic

ALU

0

OutMUX0

1

Clock

In

Load 0

x0

Clock

Load

In

Out x0

x1x2 x3 x4 x5

x0+x1 x0+x1+x2

– Accumulator circuit

– Load or accumulate on each cycle

x3 x+x4 x3+x4+x5

Building Blocks

• Combinational Logic– Compute Boolean

functions of inputs– Continuously respond to

input changes– Operate on data and

implement control

• Storage Elements– Store bits– Addressable memories– Non-addressable registers– Loaded only as clock rises

Registerfile

Registerfile

A

B

W dstW

srcA

valA

srcB

valB

valW

Clock

ALU

fun

A

B

MUX

0

1

=

Clock19

Summary

• Storage– Registers

• Hold single words• Loaded as clock rises

– Random-access memories• Hold multiple words• Possible multiple read or write ports• Read word when address input changes• Write word as clock rises

20

21

Instruction Decoding

• Instruction Format– Instruction byte icode:ifun– Optional register byte rA:rB– Optional constant word valC

5 0 rA rB D

icodeifun

rArB

valC

Optional Optional

Y86 Instruction Set #1Byte 0 1 2 3 4 5

pushl rA A 0 rA F

jXX Dest 7 fn Dest

popl rA B 0 rA F

call Dest 8 0 Dest

cmovXX rA, rB 2 fn rA rB

irmovl V, rB 3 0 F rB V

rmmovl rA, D(rB) 4 0 rA rB D

mrmovl D(rB), rA 5 0 rA rB D

OPl rA, rB 6 fn rA rB

ret 9 0

nop 1 0

halt 0 0

rrmovl 2 0

cmovle 2 1

cmovl 2 2

cmove 2 3

cmovne 2 4

cmovge 2 5

cmovg 2 6

22

Y86 Instruction Set #2Byte 0 1 2 3 4 5

pushl rA A 0 rA F

jXX Dest 7 fn Dest

popl rA B 0 rA F

call Dest 8 0 Dest

cmovXX rA, rB 2 fn rA rB

irmovl V, rB 3 0 F rB V

rmmovl rA, D(rB) 4 0 rA rB D

mrmovl D(rB), rA 5 0 rA rB D

OPl rA, rB 6 fn rA rB

ret 9 0

nop 1 0

halt 0 0 addl 6 0

subl 6 1

andl 6 2

xorl 6 3

23

Y86 Instruction Set #3Byte 0 1 2 3 4 5

pushl rA A 0 rA F

jXX Dest 7 fn Dest

popl rA B 0 rA F

call Dest 8 0 Dest

rrmovl rA, rB 2 fn rA rB

irmovl V, rB 3 0 F rB V

rmmovl rA, D(rB) 4 0 rA rB D

mrmovl D(rB), rA 5 0 rA rB D

OPl rA, rB 6 fn rA rB

ret 9 0

nop 1 0

halt 0 0

jmp 7 0

jle 7 1

jl 7 2

je 7 3

jne 7 4

jge 7 5

jg 7 624

25

Instruction Execution Stages

• Fetch– Read instruction from instruction memory

• Decode– Read program registers

• Execute– Compute value or address

• Memory– Read or write data

• Write Back– Write program registers

• PC– Update program counter

26

Executing Arith./Logical Operation

• Fetch– Read 2 bytes

• Decode– Read operand

registers

• Execute– Perform operation– Set condition codes

• Memory– Do nothing

• Write back– Update register

• PC Update– Increment PC by 2

OPl rA, rB 6 fn rA rB

Stage Computation: Arith/Log. Ops

OPl rA, rB

icode:ifun M1[PC]

rA:rB M1[PC+1]

valP PC+2

Fetch

Read instruction byte

Read register byte

Compute next PC

valA R[rA]

valB R[rB]Decode

Read operand A

Read operand B

valE valB OP valA

Set CCExecute

Perform ALU operation

Set condition code register Memory

R[rB] valE

Write

back

Write back result

PC valPPC update Update PC

27

28

SEQ Hardware Structure

• Instruction Flow– Read instruction at address specified by PC– Process through stages– Update program counter

29

InstructionInstruction memory PC increment

CCCC ALU

Data memory

Fetch

Decode

Execute

Memory

Write back

icode ifunrA rB

RegisterM

valP

srcA, srcB

dstB

valA, valB

aluA,aluB

valE

PC

valE

,

newPCPC

A B Register FileE

30

Stage Computation

• Formulate instruction execution as sequence of simple steps

• Use same general form for all instructions

31

Executing rrmovl

• Fetch– Read 2 bytes

• Decode– Read operand

register rA

• Execute– Do nothing

• Memory– Do nothing

• Write back– Update register

• PC Update– Increment PC by 2

rrmovl rA, rB 2 0 rA rB

32

Stage Computation: rrmovl

rrmovl rA, rB

icode:ifun M1[PC]

rA:rB M1[PC+1]

valP PC+2

Fetch

Read instruction byte

Read register byte

Compute next PC

valA R[rA]Decode

Read operand A

valE 0 + valAExecute

Perform ALU operation

Memory R[rB] valE

Write

back

Write back result

PC valPPC update Update PC

33

Executing irmovl

• Fetch– Read 6 bytes

• Decode– Do nothing

• Execute– Do nothing

• Memory– Do nothing

• Write back– Update register

• PC Update– Increment PC by 6

irmovl V, rB 3 0 F rB V

34

Stage Computation: irmovl

irmovl rA, rB

icode:ifun M1[PC]

rA:rB M1[PC+1]

valC M4[PC+2]valP PC+6

Fetch

Read instruction byte

Read register byte

Read constant value

Compute next PC

Decode

valE 0 + valCExecute

Perform ALU operation

Memory R[rB] valE

Write

back

Write back result

PC valPPC update Update PC

35

Executing rmmovl

• Fetch– Read 6 bytes

• Decode– Read operand

registers

• Execute– Compute effective

address

• Memory– Write to memory

• Write back– Do nothing

• PC Update– Increment PC by 6

rmmovl rA, D(rB) 4 0 rA rB D

Stage Computation: rmmovl

– Use ALU for address computation

rmmovl rA, D(rB)

icode:ifun M1[PC]

rA:rB M1[PC+1]

valC M4[PC+2]valP PC+6

Fetch

Read instruction byte

Read register byte

Read displacement D

Compute next PC

valA R[rA]

valB R[rB]Decode

Read operand A

Read operand B

valE valB + valCExecute

Compute effective address

M4[valE] valAMemory Write value to memory

Write

back

PC valPPC update Update PC

36

37

Executing mrmovl

• Fetch– Read 6 bytes

• Decode– Read operand

registers rB

• Execute– Compute effective

address

• Memory– Read from memory

• Write back– Update register rA

• PC Update– Increment PC by 6

mrmovl D(rB),rA 5 0 rA rB D

38

Stage Computation: mrmovl

• Use ALU for address computation

mrmovl D(rB) , rAicode:ifun M1[PC]rA:rB M1[PC+1]valC M4[PC+2]valP PC+6

Fetch

Read instruction byteRead register byteRead displacement DCompute next PC

valB R[rB]Decode

Read operand BvalE valB + valCExecute Compute effective

address

valM M4[valE]Memory Read data from memory

R[rA] valM

Write

back Update register rAPC valPPC update Update PC

39

Executing pushl

• Fetch– Read 2 bytes

• Decode– Read stack pointer

and rA

• Execute– Decrement stack

pointer by 4

• Memory– Store valA at the

address of new stack pointer

• Write back– Update stack

pointer

• PC Update– Increment PC by 2

pushl rA a 0 rA F

40

Stage Computation: pushl

• Use ALU to Decrement stack pointer

pushl rA

icode:ifun M1[PC]rA:rB M1[PC+1] valP PC+2

Fetch

Read instruction byteRead register byte Compute next PC

valA R[rA]valB R [%esp]

DecodeRead valARead stack pointer

valE valB + (-4)Execute

Decrement stack pointer

M4[valE] valAMemory Store to stack R[%esp] valEWrite

back

Update stack pointer

PC valPPC update Update PC

41

Executing popl

• Fetch– Read 2 bytes

• Decode– Read stack pointer

• Execute– Increment stack

pointer by 4

• Memory– Read from old stack

pointer

• Write back– Update stack pointer– Write result to register

• PC Update– Increment PC by 2

popl rA b 0 rA F

Stage Computation: popl

popl rA

icode:ifun M1[PC]

rA:rB M1[PC+1]

valP PC+2

Fetch

Read instruction byte

Read register byte

Compute next PC

valA R[%esp]

valB R [%esp]Decode

Read stack pointer

Read stack pointer

valE valB + 4Execute

Increment stack pointer

valM M4[valA]Memory Read from stack

R[%esp] valE

R[rA] valM

Write

back

Update stack pointer

Write back result

PC valPPC update Update PC

42

43

Stage Computation: popl

• Use ALU to increment stack pointer• Must update two registers

– Popped value– New stack pointer

44

Executing Jumps

jXX Dest 7 fn Dest

XX XXfall thru:

XX XXtarget:

Not taken

Taken

45

Executing Jumps

• Fetch– Read 5 bytes– Increment PC by 5

• Decode– Do nothing

• Execute– Determine whether

to take branch based on jump condition and condition codes

• Memory– Do nothing

• Write back– Do nothing

• PC Update– Set PC to Dest if

branch taken or to incremented PC if not branch

Stage Computation: Jumps

jXX Dest

icode:ifun M1[PC]

valC M4[PC+1]valP PC+5

Fetch

Read instruction byte

Read destination addressFall through address

Decode

Cnd Cond(CC,ifun)Execute

Take branch?

Memory

Write

back

PC Cnd ? valC : valPPC update Update PC

46

47

Stage Computation: Jumps

• Compute both addresses• Choose based on setting of condition codes

and branch condition

48

Executing call

call Dest 8 0 Dest

XXXXreturn:

XXXXtarget:

49

Executing call

• Fetch– Read 5 bytes– Increment PC by 5

• Decode– Read stack pointer

• Execute– Decrement stack

pointer by 4

• Memory– Write incremented

PC to new value of stack pointer

• Write back– Update stack

pointer

• PC Update– Set PC to Dest

Stage Computation: call

call Dest

icode:ifun M1[PC]

valC M4[PC+1]valP PC+5

Fetch

Read instruction byte

Read destination address Compute return point

valB R[%esp]Decode

Read stack pointer

valE valB + –4Execute

Decrement stack pointer

M4[valE] valP Memory Write return value on stack R[%esp] valE

Write

backUpdate stack pointer

PC valCPC update Set PC to destination

50

51

Stage Computation: call

• Use ALU to decrement stack pointer• Store incremented PC

52

Executing ret

ret 9 0

XXXXreturn:

53

Executing ret

• Fetch– Read 1 byte

• Decode– Read stack pointer

• Execute– Increment stack

pointer by 4

• Memory– Read return address

from old stack pointer

• Write back– Update stack pointer

• PC Update– Set PC to return

address

Stage Computation: ret

ret

icode:ifun M1[PC]

Fetch

Read instruction byte

valA R[%esp]

valB R[%esp]Decode

Read operand stack pointerRead operand stack pointervalE valB + 4

ExecuteIncrement stack pointer

valM M4[valA] Memory Read return address

R[%esp] valE

Write

back

Update stack pointer

PC valMPC update Set PC to return

address

54

55

Stage Computation: ret

• Use ALU to increment stack pointer• Read return address from memory

Next

• SEQ Implementation• Suggested Reading 4.3.1, 4.3.4

56

Recommended