

Slipstream Processors by Pujan Joshi 1

Pujan Joshi

May 6th, 2008

Slipstream Processors: Improving both Performance and Fault Tolerance


Analogy - NASCAR


The Paradigm

It is possible to make forward progress by executing only a subset of the original program.

Ineffectual instructions:
• Non-modifying writes
• Unreferenced writes
• Correctly predicted branches
• ...and also their dependence chains
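To make the two write categories concrete, here is a minimal Python sketch that scans a toy trace of register reads and writes. The trace format and the name `find_ineffectual_writes` are assumptions made for this illustration; the sketch does not model branches or dependence chains.

```python
def find_ineffectual_writes(trace, initial=None):
    """Return indices of writes that are non-modifying or unreferenced.

    trace: list of ("write", reg, value) or ("read", reg) tuples
           (a hypothetical format chosen for this sketch).
    """
    state = dict(initial or {})   # current register values
    pending = {}                  # reg -> index of the latest write not yet read
    ineffectual = set()
    for i, op in enumerate(trace):
        if op[0] == "write":
            _, reg, value = op
            # Non-modifying write: stores the value the register already holds.
            if reg in state and state[reg] == value:
                ineffectual.add(i)
                continue          # the earlier producer of this value stays live
            # Unreferenced write: the previous write is overwritten before any read.
            if reg in pending:
                ineffectual.add(pending[reg])
            state[reg] = value
            pending[reg] = i
        else:
            _, reg = op
            pending.pop(reg, None)  # the pending write has now been referenced
    # Writes still pending at the end are left unclassified: a later read may use them.
    return sorted(ineffectual)


trace = [
    ("write", "r1", 5),   # 0: overwritten at 2 before any read
    ("write", "r2", 7),   # 1: read at 3
    ("write", "r1", 9),   # 2: read at 5
    ("read", "r2"),       # 3
    ("write", "r2", 7),   # 4: r2 already holds 7
    ("read", "r1"),       # 5
]
print(find_ineffectual_writes(trace))  # [0, 4]
```

Write 0 is unreferenced and write 4 is non-modifying, so both could be skipped without changing what the reads observe.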


Slipstreaming

A slipstream processor creates a shorter instruction stream by skipping the ineffectual instructions.

Two copies of the same program are run: the full program and the shortened program.


Slipstream Execution

The shortened program is speculatively reduced and runs slightly ahead of the full program.

The leading program is called the advanced stream (A-stream) and the trailing program is called the redundant stream (R-stream).


Advanced Stream (A-stream)

• Executes fewer instructions.
• Needs some hardware support.
• Communicates its results to the R-stream.


Redundant Stream (R-stream)

• Executes the whole program.
• Executes efficiently because it receives control and data outcomes as predictions from the A-stream.
• Compares those predicted values against its own outcomes; if a deviation is detected, the corrupted A-stream context is recovered from the R-stream context.
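The checking role of the R-stream can be sketched in a few lines of Python. The program encoding (a list of `(dest, fn)` pairs), the `predictions` dictionary, and the name `r_stream_check` are all assumptions made for this example. The sketch models only the comparison, not the speedup the R-stream gains from using the predictions.

```python
def r_stream_check(program, predictions, regs):
    """Execute every instruction and compare against A-stream predictions.

    program:     list of (dest_reg, fn), where fn(regs) returns the result
                 (hypothetical encoding for this sketch).
    predictions: dict mapping instruction index -> value forwarded by the
                 A-stream; instructions the A-stream skipped are absent.
    Returns (registers, index of first deviation or None).
    """
    regs = dict(regs)
    for i, (dest, fn) in enumerate(program):
        value = fn(regs)          # the R-stream always computes its own outcome
        regs[dest] = value
        if i in predictions and predictions[i] != value:
            # Deviation: the A-stream context is corrupted and would have to
            # be recovered from the R-stream state held in regs.
            return regs, i
    return regs, None


program = [
    ("a", lambda r: 2),
    ("b", lambda r: r["a"] + 3),   # skipped by the A-stream in this example
    ("c", lambda r: r["b"] * 2),
]
print(r_stream_check(program, {0: 2, 2: 10}, {}))  # no deviation
print(r_stream_check(program, {0: 2, 2: 11}, {}))  # deviation at index 2
```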


Interpretation

• A-stream: a program-based predictor
• R-stream: a fast checker

[Diagram: A-stream, R-stream, checker]


Slipstream Architecture


Micro-Architecture

Each processor is a conventional superscalar processor with a branch predictor, instruction and data caches, an execution engine, and a reorder buffer.

New components added:
• Instruction removal predictor (IR-predictor)
• Instruction removal detector (IR-detector)
• Delay buffer
• Recovery controller


The Instruction Removal Detector

• Monitors the R-stream.
• Checks for ineffectual instructions.
• Checks for correctly predicted branches.
• Tells the IR-predictor which instructions can be skipped.


Removal Mechanism

• Removes confidently predicted branches and the computations needed for them.
• Removes highly value-predictable computations.
• Ineffectual and branch-predictable instructions are removed and the PC is updated.
• Computations are replaced by the value.


Instruction Removal Predictor

• Generates the PC for the next block of instructions.
• Removes the instructions suggested by the IR-detector from the fetch block.

Strategy:
• Built on top of a conventional trace predictor.
• Adds a few more fields: IR-vector, intermediate PCs, and confidence counters.
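One way to picture the removal step is as a bit mask applied to a fetch block. The sketch below assumes the IR-vector is a list of 0/1 flags, one per instruction, with 1 meaning "remove". That encoding and the name `apply_ir_vector` are illustrative assumptions, not details taken from the slides.

```python
def apply_ir_vector(fetch_block, ir_vector):
    """Return the instructions the A-stream would still fetch.

    fetch_block: list of instructions (plain strings in this sketch).
    ir_vector:   list of 0/1 flags, 1 = remove (assumed encoding).
    """
    if len(fetch_block) != len(ir_vector):
        raise ValueError("IR-vector must have one flag per instruction")
    return [ins for ins, remove in zip(fetch_block, ir_vector) if not remove]


block = ["add r1,r2,r3", "beq r1,r0,L1", "st r4,0(r5)", "sub r6,r1,r2"]
shortened = apply_ir_vector(block, [0, 1, 1, 0])
print(shortened)
```

The R-stream would still fetch all four instructions; only the A-stream sees the shortened block.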


Instruction Removal Predictor (continued)

Confidence counters:
• The IR-detector updates these counters.
• The corresponding instruction is removed if its counter is saturated.

The IR-vector and intermediate PCs are used to remove instructions.
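The saturation rule can be sketched as a small saturating counter per instruction. The threshold of 4 and the reset-to-zero policy on a miss are assumptions chosen for this example; the slides do not give the actual counter width or update policy.

```python
class ConfidenceCounter:
    """Saturating counter updated by a (hypothetical) IR-detector model."""

    def __init__(self, threshold=4):   # threshold value is an assumption
        self.threshold = threshold
        self.count = 0

    def update(self, was_ineffectual):
        if was_ineffectual:
            self.count = min(self.count + 1, self.threshold)
        else:
            self.count = 0             # assumed policy: any miss resets confidence

    @property
    def saturated(self):
        return self.count >= self.threshold


def removal_flags(counters):
    """One flag per instruction: 1 if its counter is saturated, else 0."""
    return [1 if c.saturated else 0 for c in counters]


counters = [ConfidenceCounter() for _ in range(3)]
for _ in range(4):
    counters[0].update(True)           # ineffectual four times in a row
for outcome in (True, True, True, False):
    counters[1].update(outcome)        # confidence lost on the last update
for _ in range(3):
    counters[2].update(True)           # not yet saturated
print(removal_flags(counters))  # [1, 0, 0]
```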


Delay Buffer

• FIFO queue.
• Control flow: trace IDs and IR-vector.
• Data flow: operand register names, values, and load/store addresses.
• The A-stream enqueues and the R-stream dequeues.
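A bounded FIFO captures the producer/consumer relationship described above. In this sketch the class name `DelayBuffer`, the entry layout, and the behavior when the buffer is full or empty are assumptions made for illustration.

```python
from collections import deque


class DelayBuffer:
    """Bounded FIFO between the A-stream (producer) and R-stream (consumer)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()

    def enqueue(self, entry):
        """A-stream side. Returns False when full (assumed: the A-stream waits)."""
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append(entry)
        return True

    def dequeue(self):
        """R-stream side. Returns None when there is nothing to consume."""
        return self.entries.popleft() if self.entries else None

    def flush(self):
        """Discard all entries."""
        self.entries.clear()


buf = DelayBuffer(capacity=2)
# Hypothetical entry layout: control-flow fields plus data-flow values.
first = {"trace_id": 17, "ir_vector": [0, 1, 0], "values": {"r1": 5}}
second = {"trace_id": 18, "ir_vector": [0, 0, 0], "values": {"r2": 9}}
print(buf.enqueue(first), buf.enqueue(second), buf.enqueue(first))
print(buf.dequeue()["trace_id"])
```

Entries come out in the order the A-stream produced them, which is what lets the R-stream match each prediction to the instruction it belongs to.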


Recovery Controller

• The A-stream context can be corrupted.
• Tracks the addresses that store instructions write to.
• Recovers the corrupted A-stream context from the R-stream (both register and memory values).
• The delay buffer is flushed.
• The A-stream PCs are restored.
• The IR-predictor is backed up to the precise program counter.
• The entire register file is copied via the delay buffer.
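The recovery sequence can be sketched as follows. Contexts are modeled as plain dictionaries and the delay buffer as a list; the names `RecoveryController`, `note_store`, and `recover` are hypothetical. The sketch copies registers directly rather than through the delay buffer, and it repairs only the tracked store addresses, which is an assumption about how the address tracking is used.

```python
class RecoveryController:
    """Toy model of A-stream recovery from the R-stream context."""

    def __init__(self):
        self.tracked_addresses = set()   # addresses written by stores

    def note_store(self, address):
        self.tracked_addresses.add(address)

    def recover(self, a_ctx, r_ctx, delay_buffer):
        delay_buffer.clear()                    # flush stale predictions
        a_ctx["regs"] = dict(r_ctx["regs"])     # copy the entire register file
        for addr in self.tracked_addresses:     # repair only tracked memory
            if addr in r_ctx["mem"]:
                a_ctx["mem"][addr] = r_ctx["mem"][addr]
            else:
                a_ctx["mem"].pop(addr, None)
        a_ctx["pc"] = r_ctx["pc"]               # restart from the precise PC
        self.tracked_addresses.clear()


a_ctx = {"pc": 40, "regs": {"r1": 99}, "mem": {0x10: 1, 0x20: 5}}
r_ctx = {"pc": 32, "regs": {"r1": 7}, "mem": {0x10: 3, 0x20: 5}}
pending = ["stale entry"]

controller = RecoveryController()
controller.note_store(0x10)
controller.recover(a_ctx, r_ctx, pending)
print(a_ctx, pending)
```

Tracking store addresses means recovery touches only the memory locations that may differ, instead of copying the whole memory image.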


Simulation Environment


Slipstream Performance


Doubling superscalar complexity


Results

A 7% average performance improvement is achieved by harnessing an otherwise unused, additional processor in a single-chip multiprocessor (CMP).

Doubling the window size and issue bandwidth of the superscalar processor improves performance by 28% on average.


Advantages

• Exploits an existing, otherwise unused processor in a CMP.
• Speeds up a single program.
• Competitive with superscalar: 1/4 of the speedup of a larger superscalar. (An improved slipstream design performs comparably or better: MICRO-33.)
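The 1/4 figure appears to follow from the two averages reported in the results (7% for slipstream, 28% for the doubled superscalar). This reading is an inference from those numbers, not something the slides state explicitly:

```latex
\frac{\text{slipstream improvement}}{\text{larger superscalar improvement}}
  = \frac{7\%}{28\%} = \frac{1}{4}
```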


Related Work/Motivation

• A-Stream/R-Stream Simultaneous Multithreading (AR-SMT).
• Reduced programs with the same output.


More Flexible Architecture

• CMP/SMT: throughput and parallel-program performance.
• Slipstream: improved single-program performance and reliability.
• AR-SMT / SRT: high reliability with little performance overhead.


References

• “Slipstream Processors: Improving both Performance and Fault Tolerance”, Karthik Sundaramoorthy, Zach Purser, and Eric Rotenberg.
• “A Study of Slipstream Processors”, Zach Purser, Karthik Sundaramoorthy, and Eric Rotenberg.
• “A Simple Mechanism for Detecting Ineffectual Instructions in Slipstream Processors”, Jinson J. Koppanalil and Eric Rotenberg.