Pipelined Datapath and Control (Lecture #13), ECE 445 – Computer Organization. The slides included herein were taken from the materials accompanying Computer Organization and Design, 4th Edition, by Patterson and Hennessy, and were used with permission from Morgan Kaufmann Publishers.

Pipelined Datapath and Control

(Lecture #13)

ECE 445 – Computer Organization

The slides included herein were taken from the materials accompanying Computer Organization and Design, 4th Edition, by Patterson and Hennessy, and were used with permission from Morgan Kaufmann Publishers.

Fall 2010 ECE 445 - Computer Organization 2

Material to be covered ...

Chapter 4: Sections 5 – 9, 13 – 14

Performance of the Single-Cycle MIPS

Example: MIPS Clock Rate

Determine the clock rate for the MIPS architecture, assuming the following:

- The MIPS is a single-cycle machine: 1 clock cycle per instruction (CPI = 1)
- Access time for memory units = 200 ps
- Operation time for ALU and adders = 100 ps
- Access time for register file = 50 ps

Example: MIPS Clock Rate

| Instruction class | Functional units used by the instruction class (in order) |
|---|---|
| ALU instruction | Inst. fetch, register, ALU, register |
| Load word | Inst. fetch, register, ALU, memory, register |
| Store word | Inst. fetch, register, ALU, memory |
| Branch | Inst. fetch, register, ALU |
| Jump | Inst. fetch |

Example: MIPS Clock Rate

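| Instruction class | Instr. memory (ps) | Register read (ps) | ALU operation (ps) | Data memory (ps) | Register write (ps) | Total (ps) |
|---|---|---|---|---|---|---|
| ALU instruction | 200 | 50 | 100 | 0 | 50 | 400 |
| Load word | 200 | 50 | 100 | 200 | 50 | 600 |
| Store word | 200 | 50 | 100 | 200 | 0 | 550 |
| Branch | 200 | 50 | 100 | 0 | 0 | 350 |
| Jump | 200 | 0 | 0 | 0 | 0 | 200 |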

Example: MIPS Clock Rate

The clock cycle time for a machine with a single clock cycle per instruction will be determined by the longest instruction.

In this example, the load word instruction requires 600 ps.

The clock rate is then

Clock rate = 1 / Clock Cycle Time

Clock rate = 1 / 600 ps = 1.67 GHz
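The arithmetic in this example can be checked with a short script. This is only a sketch: the delays and the per-class unit usage come from the tables above, while the names `UNIT_DELAYS_PS`, `USAGE`, and `instruction_delay_ps` are made up for illustration.

```python
# Delays in picoseconds for the five steps, in datapath order:
# instruction memory, register read, ALU, data memory, register write.
UNIT_DELAYS_PS = (200, 50, 100, 200, 50)

# 1 means the instruction class uses that step, 0 means it does not.
USAGE = {
    "ALU instruction": (1, 1, 1, 0, 1),
    "Load word": (1, 1, 1, 1, 1),
    "Store word": (1, 1, 1, 1, 0),
    "Branch": (1, 1, 1, 0, 0),
    "Jump": (1, 0, 0, 0, 0),
}


def instruction_delay_ps(usage):
    """Sum the delays of the steps an instruction class actually uses."""
    return sum(used * delay for used, delay in zip(usage, UNIT_DELAYS_PS))


delays = {name: instruction_delay_ps(usage) for name, usage in USAGE.items()}

# In a single-cycle machine the slowest instruction sets the clock period.
cycle_time_ps = max(delays.values())

# 1 / (1 ps) = 10^12 Hz = 1000 GHz, so the rate in GHz is 1000 / (period in ps).
clock_rate_ghz = 1000 / cycle_time_ps

print(delays)
print(f"cycle time = {cycle_time_ps} ps, clock rate = {clock_rate_ghz:.2f} GHz")
```

Taking the maximum over all instruction classes is what makes load word the critical path here.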

Performance Issues

- Longest delay determines clock period
  - Critical path: load word (lw) instruction
  - Instruction memory → register file → ALU → data memory → register file
- Not feasible to vary clock period for different instructions
- Violates design principle
  - Making the common case fast
- Improve performance by pipelining

How does pipelining work?

Pipelining Analogy (§4.5 An Overview of Pipelining)

- Pipelined laundry: overlapping execution
  - Parallelism improves performance
- Four loads: Speedup = 8/3.5 = 2.3
- Non-stop: Speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages
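The two speedup figures for the laundry analogy can be reproduced with a small sketch. It assumes four stages of 0.5 hours each, which is inferred from the 2n and 0.5n + 1.5 terms in the formula rather than stated on the slide. The function names are illustrative.

```python
STAGE_HOURS = 0.5  # assumed duration of one laundry stage
NUM_STAGES = 4     # assumed number of stages


def sequential_hours(n):
    """Each load finishes all stages before the next starts: 2n hours."""
    return n * NUM_STAGES * STAGE_HOURS


def pipelined_hours(n):
    """First load takes 2 hours, each later load adds one stage time: 0.5n + 1.5."""
    return NUM_STAGES * STAGE_HOURS + (n - 1) * STAGE_HOURS


def speedup(n):
    return sequential_hours(n) / pipelined_hours(n)


print(f"4 loads: {speedup(4):.1f}")
print(f"10000 loads: {speedup(10000):.3f}")
```

As n grows, the fixed 1.5-hour fill time matters less and the ratio approaches the number of stages.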

Objective:

Keep all stages of the pipeline busy at all times.

Pipelining: Improving Performance

| | Latency | Max. throughput |
|---|---|---|
| Non-pipelined | 2 hours | 0.5 loads per hour |
| Pipelined | 2 hours | 2 loads per hour |

Latency = time from the start of one load to the end of the same load. The length of time for each load does not change.

Maximum throughput = number of loads completed per hour, assuming all stages of the pipeline are busy at all times.

Pipelining: Improving Performance

Pipelining improves performance by increasing instruction throughput, rather than decreasing the execution time of an individual instruction.

The MIPS Pipeline

MIPS Pipeline

Five stages, one step per stage

- IF: Instruction fetch from memory
- ID: Instruction decode & register read
- EX: Execute operation or calculate address
- MEM: Access memory operand
- WB: Write result back to register

MIPS Pipeline

Pipeline Performance

- Assume time for stages is
  - 100 ps for register read or write
  - 200 ps for other stages
- Compare pipelined datapath with single-cycle datapath

| Instr | Instr fetch | Register read | ALU op | Memory access | Register write | Total time |
|---|---|---|---|---|---|---|
| lw | 200 ps | 100 ps | 200 ps | 200 ps | 100 ps | 800 ps |
| sw | 200 ps | 100 ps | 200 ps | 200 ps | | 700 ps |
| R-format | 200 ps | 100 ps | 200 ps | | 100 ps | 600 ps |
| beq | 200 ps | 100 ps | 200 ps | | | 500 ps |

Pipeline Performance

Single-cycle (Tc = 800 ps): Why is the clock period 800 ps?

Pipelined (Tc = 200 ps): Why is the clock period 200 ps?
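One way to reason about the two clock periods is sketched below, using the stage times from the Pipeline Performance table. The single-cycle period must cover every step of the slowest instruction (lw), while the pipelined period only has to cover the slowest single stage. The three-instruction comparison and the function names are illustrative assumptions, not taken from the slide.

```python
# Stage times in picoseconds for lw, which uses all five stages.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Single-cycle: one clock must fit all of lw's steps back to back.
single_cycle_tc = sum(STAGE_PS.values())

# Pipelined: every stage gets one clock, so the slowest stage sets the period.
pipelined_tc = max(STAGE_PS.values())


def single_cycle_total_ps(n):
    """n instructions, one full-length clock each."""
    return n * single_cycle_tc


def pipelined_total_ps(n):
    """Fill the pipeline once, then finish one instruction per clock."""
    return (len(STAGE_PS) + n - 1) * pipelined_tc


print(single_cycle_tc, pipelined_tc)
print(single_cycle_total_ps(3), pipelined_total_ps(3))
```

For a short sequence the fill time dominates, so three instructions gain much less than the ratio of the two clock periods would suggest.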

Pipeline Speedup

- If all stages are balanced, i.e., all take the same time:

$$\text{Time between instructions}_{\text{pipelined}} = \frac{\text{Time between instructions}_{\text{nonpipelined}}}{\text{Number of stages}}$$

- If not balanced, speedup is less
- Speedup due to increased throughput
  - Latency (time for each instruction) does not decrease
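As a worked check against the Pipeline Performance numbers (800 ps single-cycle, five stages, slowest stage 200 ps), the balanced formula would give

$$\text{Time between instructions}_{\text{pipelined}} = \frac{800\ \text{ps}}{5} = 160\ \text{ps},$$

but the stages are not balanced, so the pipelined clock is set by the slowest stage at 200 ps and the steady-state speedup is

$$\frac{\text{Time between instructions}_{\text{nonpipelined}}}{\text{Time between instructions}_{\text{pipelined}}} = \frac{800\ \text{ps}}{200\ \text{ps}} = 4 < 5.$$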

Pipelining and ISA Design

- MIPS ISA designed for pipelining
  - All instructions are 32 bits
    - Easier to fetch and decode in one cycle
    - cf. x86: 1- to 17-byte instructions
  - Few and regular instruction formats
    - Can decode and read registers in one step
  - Load/store addressing
    - Can calculate address in 3rd stage, access memory in 4th stage
  - Alignment of memory operands (i.e., on word boundaries)
    - Memory access takes only one cycle

Pipeline Summary (The BIG Picture)

- Pipelining improves performance by increasing instruction throughput
  - Executes multiple instructions in parallel
  - Each instruction has the same latency
- Subject to hazards
  - Structure, data, control
  - Hazards will be discussed in upcoming lectures
- Instruction set design affects complexity of pipeline implementation

MIPS Pipelined Datapath (§4.6 Pipelined Datapath and Control)

Pipeline registers

- Need registers between stages
  - To hold information produced in previous cycle
- Why?
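A toy model may help with the "Why?". Because every stage works on a different instruction in the same cycle, the output of each stage has to be captured at the clock edge so the next stage can use it one cycle later. The sketch below models only which instruction each stage holds, not the real datapath signals. The function name `run_pipeline` and the example program are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def run_pipeline(program, cycles):
    """Return, for each cycle, the instruction occupying each stage."""
    # latches[i] models the pipeline register between stage i and stage i + 1.
    latches = [None] * (len(STAGES) - 1)
    history = []
    for cycle in range(cycles):
        fetched = program[cycle] if cycle < len(program) else None
        # IF works on the newly fetched instruction; every other stage
        # works on whatever the register in front of it captured last cycle.
        in_stage = [fetched] + latches
        history.append(dict(zip(STAGES, in_stage)))
        # Clock edge: each register captures the output of the stage before it.
        latches = in_stage[:-1]
    return history


for cycle, state in enumerate(run_pipeline(["lw", "sw", "add"], 7), start=1):
    print(cycle, state)
```

Without the `latches` list, a stage would have nothing stable to read once the previous stage moved on to the next instruction.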

Pipeline Operation

- Cycle-by-cycle flow of instructions through the pipelined datapath
  - "Single-clock-cycle" pipeline diagram
    - Shows pipeline usage in a single cycle
    - Highlight resources used
  - "Multi-clock-cycle" diagram
    - Graph of operation over time
- We'll look at "single-clock-cycle" diagrams for load word and store word.

IF for Load, Store, …

ID for Load, Store, …

EX for Load

MEM for Load

WB for Load

Wrong register number. Why?

Corrected Datapath for Load
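The idea behind the correction can be sketched in a few lines. The sketch assumes the uncorrected datapath takes the write register number from the register between IF and ID. When a load reaches WB, that register already holds an instruction fetched three cycles later, so the wrong destination is used. Carrying the destination number along with the load through the pipeline registers avoids this. The function name, the three-cycle offset as modelled here, and the register numbers are illustrative assumptions.

```python
def write_register_in_wb(dest_regs, index, carried):
    """Register number seen by WB for the instruction at position `index`.

    dest_regs: destination register of each instruction, in program order.
    carried:   True if the destination travels with the instruction
               through the pipeline registers (corrected datapath).
    """
    if carried:
        return dest_regs[index]
    # Uncorrected: WB reads the number from the register feeding ID, which
    # by now belongs to the instruction three positions later.
    later = index + 3
    return dest_regs[later] if later < len(dest_regs) else None


# Hypothetical program: five loads writing registers 8 through 12.
dests = [8, 9, 10, 11, 12]
print(write_register_in_wb(dests, 0, carried=False))
print(write_register_in_wb(dests, 0, carried=True))
```

In the uncorrected case the first load's data would land in the register named by the fourth instruction.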

EX for Store

MEM for Store

WB for Store

Multi-Cycle Pipeline Diagram

- Form showing resource usage

Multi-Cycle Pipeline Diagram

- Traditional form
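Since the figure itself is not reproduced here, the following sketch generates a diagram in the traditional form: one row per instruction, one column per clock cycle, and the stage name in each cell. It assumes an ideal pipeline with no stalls. The instruction names and the helper names `multi_cycle_diagram` and `format_diagram` are placeholders.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def multi_cycle_diagram(instructions):
    """Build rows of cells: a header row, then one row per instruction."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = [["instr"] + [f"CC{c}" for c in range(1, total_cycles + 1)]]
    for i, name in enumerate(instructions):
        cells = [""] * total_cycles
        # Instruction i enters IF in cycle i + 1 and advances one stage per cycle.
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage
        rows.append([name] + cells)
    return rows


def format_diagram(rows):
    """Pad every cell to the same width so the columns line up."""
    width = max(len(cell) for row in rows for cell in row)
    return "\n".join(
        " ".join(cell.ljust(width) for cell in row).rstrip() for row in rows
    )


print(format_diagram(multi_cycle_diagram(["lw", "sub", "add"])))
```

Reading a single column of the output gives the state of the pipeline in that cycle, which is what a single-clock-cycle diagram shows.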

Single-Cycle Pipeline Diagram

- State of pipeline in a given cycle

Questions?