Flynn's Classification - Asst. Prof. Vincy Joseph Web viewModule . V. I. Explain Flynn’s Classification (05Marks) Explain Flynn’s Classification in detail(10Marks) ... 04/11/2017

COA Question Bank Ms. Vincy Joseph

Class: SE CMPNA&B

Subject: Computer Architecture and Organization

Module VI

1. Explain Flynn’s Classification (05Marks)

2. Explain Flynn’s Classification in detail (10Marks)

Flynn's Classification

There are different ways to classify parallel computers. One of the more widely used classifications, in use since 1966, is called Flynn's Taxonomy. Flynn's taxonomy distinguishes multi-processor computer architectures along the two independent dimensions of Instruction Stream and Data Stream. Each of these dimensions can have only one of two possible states: Single or Multiple.

Combining the two dimensions gives the 4 possible classifications according to Flynn:

                        Single Data    Multiple Data
Single Instruction      SISD           SIMD
Multiple Instruction    MISD           MIMD

Single Instruction, Single Data (SISD):

A serial (non-parallel) computer


Single Instruction: Only one instruction stream is being acted on by the CPU during any one clock cycle

Single Data: Only one data stream is being used as input during any one clock cycle

Deterministic execution

This is the oldest type of computer

Examples: older generation mainframes, minicomputers, workstations and single processor/core PCs.


Single Instruction, Multiple Data (SIMD):

Single Instruction: All processing units execute the same instruction at any given clock cycle

Multiple Data: Each processing unit can operate on a different data element

Best suited for specialized problems characterized by a high degree of regularity, such as graphics/image processing.

Two varieties: Processor Arrays and Vector Pipelines

Examples:

Processor Arrays: Thinking Machines CM-2, MasPar MP-1 & MP-2, ILLIAC IV


Vector Pipelines: IBM 9000, Cray X-MP, Y-MP & C90, Fujitsu VP, NEC SX-2, Hitachi S820, ETA10


Most modern computers, particularly those with graphics processor units (GPUs) employ SIMD instructions and execution units.
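The SIMD idea can be sketched in a few lines of Python. This is only a model of the semantics, not of the hardware: the helper name simd_execute and the sample data are assumptions made for illustration, and the Python loop visits the lanes one after another, whereas real SIMD hardware applies the instruction to all lanes in the same clock cycle.

```python
import operator

def simd_execute(instruction, lanes_a, lanes_b):
    """Apply ONE instruction to every pair of data elements (one pair per lane)."""
    return [instruction(a, b) for a, b in zip(lanes_a, lanes_b)]

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# Single instruction (add), multiple data (four element pairs).
print(simd_execute(operator.add, a, b))  # [11, 22, 33, 44]
```

An SISD machine would need four separate add instructions to produce the same result.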

Multiple Instruction, Single Data (MISD):

Multiple Instruction: Each processing unit operates on the data independently via separate instruction streams.

Single Data: A single data stream is fed into multiple processing units.

Few (if any) actual examples of this class of parallel computer have ever existed.

Some conceivable uses might be:

• multiple frequency filters operating on a single signal stream

• multiple cryptography algorithms attempting to crack a single coded message.
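The first conceivable use, several filters working on one signal stream, can be modelled as follows. This is a sketch only: the two filters (a two-point average and a two-point difference), the helper misd_execute and the sample signal are assumptions chosen for illustration, and the units run one after another here rather than in parallel.

```python
def low_pass(samples):
    """Two-point moving average (the sample before the first is taken as 0)."""
    prev, out = 0.0, []
    for s in samples:
        out.append((s + prev) / 2)
        prev = s
    return out

def high_pass(samples):
    """Two-point difference (the sample before the first is taken as 0)."""
    prev, out = 0.0, []
    for s in samples:
        out.append((s - prev) / 2)
        prev = s
    return out

def misd_execute(units, stream):
    """Feed the SAME data stream to every unit; each unit runs its own instructions."""
    return {name: unit(stream) for name, unit in units.items()}

signal = [2.0, 4.0, 4.0, 0.0]
outputs = misd_execute({"low": low_pass, "high": high_pass}, signal)
print(outputs["low"])   # [1.0, 3.0, 4.0, 2.0]
print(outputs["high"])  # [1.0, 1.0, 0.0, -2.0]
```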

Multiple Instruction, Multiple Data (MIMD):

Multiple Instruction: Every processor may be executing a different instruction stream

Multiple Data: Every processor may be working with a different data stream

Execution can be synchronous or asynchronous, deterministic or non-deterministic

Currently, the most common type of parallel computer - most modern supercomputers fall into this category.


Examples: most current supercomputers, networked parallel computer clusters and "grids", multi-processor SMP computers, multi-core PCs.

Note: many MIMD architectures also include SIMD execution sub-components
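A rough software analogy for MIMD is a set of workers, each running a different function on different data. The sketch below uses Python threads for this; the function names and inputs are hypothetical, and Python threads only model the structure (independent instruction streams and data streams), not the true parallel speed of separate processors.

```python
from concurrent.futures import ThreadPoolExecutor

def total(values):
    """Instruction stream 1: add up numbers."""
    return sum(values)

def longest(words):
    """Instruction stream 2: find the longest word."""
    return max(words, key=len)

def mimd_execute(tasks):
    """Run each (function, data) pair on its own worker."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(fn, data) for fn, data in tasks]
        return [f.result() for f in futures]

results = mimd_execute([(total, [1, 2, 3]), (longest, ["cpu", "memory", "bus"])])
print(results)  # [6, 'memory']
```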


Types of MIMD

Shared Memory

Shared memory parallel computers vary widely, but generally have in common the ability for all processors to access all memory as a global address space.

Multiple processors can operate independently but share the same memory resources.

Changes in a memory location effected by one processor are visible to all other processors.
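The shared memory model can be illustrated with threads that all update one variable. This is a minimal sketch: the names shared, worker and run are assumptions, threads stand in for processors, and the lock is an added detail (not discussed above) that keeps concurrent updates from interfering with each other.

```python
import threading

shared = {"counter": 0}   # one memory location visible to every worker
lock = threading.Lock()   # assumption: a lock serialises the updates

def worker(increments):
    for _ in range(increments):
        with lock:
            shared["counter"] += 1

def run(num_workers, increments):
    shared["counter"] = 0
    threads = [threading.Thread(target=worker, args=(increments,))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["counter"]

print(run(4, 1000))  # 4000
```

Every worker sees the changes made by the others because there is only one copy of the counter.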


Distributed Memory

Distributed memory systems require a communication network to connect inter-processor memory.

Processors have their own local memory. Memory addresses in one processor do not map to another processor, so there is no concept of global address space across all processors.

Because each processor has its own local memory, it operates independently. Changes it makes to its local memory have no effect on the memory of other processors. Hence, the concept of cache coherency does not apply.

When a processor needs access to data in another processor, it is usually the task of the programmer to explicitly define how and when data is communicated. Synchronization between tasks is likewise the programmer's responsibility.
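The explicit communication described above can be imitated with message passing. In the sketch below each node works only on its own local data and sends its partial result as a message; the queue stands in for the communication network, and the names node and distributed_sum are hypothetical. Real distributed memory systems would use separate machines and a message passing library rather than threads in one process.

```python
import queue
import threading

def node(local_data, network):
    """A node touches only its own data, then explicitly sends the result."""
    partial = sum(local_data)   # local memory access only
    network.put(partial)        # explicit message to the collecting node

def distributed_sum(chunks):
    network = queue.Queue()     # stands in for the interconnect
    nodes = [threading.Thread(target=node, args=(chunk, network))
             for chunk in chunks]
    for n in nodes:
        n.start()
    for n in nodes:
        n.join()
    # The collecting node receives one message per node and combines them.
    return sum(network.get() for _ in chunks)

print(distributed_sum([[1, 2], [3, 4], [5]]))  # 15
```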

Advantages:

Memory is scalable with the number of processors. Increase the number of processors and the size of memory increases proportionately.

Each processor can rapidly access its own memory without interference and without the overhead incurred with trying to maintain global cache coherency.

Cost effectiveness: can use commodity, off-the-shelf processors and networking.

Disadvantages:

The programmer is responsible for many of the details associated with data communication between processors.


It may be difficult to map existing data structures, based on global memory, to this memory organization.

Non-uniform memory access times - data residing on a remote node takes longer to access than node local data.

3. What is instruction pipelining? What are the advantages of pipelining? (06Marks)

4. Explain 6 stage instruction pipeline with suitable diagram (10Marks)

Pipelining

It is a technique of decomposing a sequential process into sub-operations, with each sub-process being executed in a special dedicated segment that operates concurrently with all other segments.

It improves processor performance by overlapping the execution of multiple instructions
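The gain from overlapping can be estimated with a small calculation. The sketch below assumes an ideal pipeline: k stages of equal duration, one clock cycle per stage, and no stalls. Under these assumptions the first instruction needs k cycles and every later instruction completes one cycle after the previous one. The function names are chosen for illustration.

```python
def sequential_cycles(n, k):
    """Cycles for n instructions when each one passes through all k stages alone."""
    return n * k

def pipelined_cycles(n, k):
    """Cycles for n instructions in an ideal k-stage pipeline: k + (n - 1)."""
    return k + (n - 1)

def speedup(n, k):
    return sequential_cycles(n, k) / pipelined_cycles(n, k)

print(sequential_cycles(9, 6))  # 54
print(pipelined_cycles(9, 6))   # 14
print(round(speedup(9, 6), 2))  # 3.86
```

As n grows, the speedup approaches k, the number of stages.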

Instruction Pipelining

• The instruction processing is divided into following 6 stages:

– Fetch Instruction(FI): Read next instruction to buffer

– Decode Instruction(DI):Determine opcode and operand specifiers

– Calculate Operands (CO): Calculate address of source operands

– Fetch Operands(FO): Fetch operands from memory

– Execute Instructions(EI): Perform the indicated operation

– Write Operand(WO): Store result in memory
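The overlap of the six stages can be made visible with a space-time (timing) diagram. The sketch below builds one row per instruction, assuming an ideal pipeline with no stalls in which instruction i enters FI one cycle after instruction i-1. The function name timing_diagram is hypothetical.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]

def timing_diagram(num_instructions):
    """Return one row per instruction; column c holds the stage active in cycle c+1."""
    total_cycles = len(STAGES) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        cells = ["  "] * total_cycles          # blank = instruction not in the pipeline
        for s, name in enumerate(STAGES):
            cells[i + s] = name                # instruction i is in stage s at cycle i+s
        rows.append(cells)
    return rows

for i, row in enumerate(timing_diagram(3), start=1):
    print(f"I{i}: " + " ".join(row))
```

Reading down any column shows which stages are busy in that clock cycle; with three instructions the diagram is 8 cycles wide.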

State Diagram


Limitations of 6-stage pipeline

• Assumes that each instruction goes through all 6 stages. This is not always true, e.g. LOAD does not require the WO stage.

• Assumes that all stages can be performed in parallel and that there are no memory conflicts. However, FI, FO and WO all involve a memory access and may occur simultaneously, which most memory systems do not permit.

• If the six stages are not of equal duration, there will be some waiting time involved at various pipeline stages.

• Conditional branch instructions and interrupts can invalidate several instruction fetches.

• Register conflict and Memory conflict
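The waiting time caused by unequal stage durations (the third limitation above) can be quantified. The sketch assumes that the clock period is set by the slowest stage and ignores any latch overhead between stages; the stage delays are made-up values in nanoseconds, listed in the order FI, DI, CO, FO, EI, WO.

```python
def cycle_time(stage_delays):
    """The clock must be long enough for the slowest stage."""
    return max(stage_delays)

def idle_time_per_cycle(stage_delays):
    """Time each stage spends waiting in every clock cycle."""
    clock = cycle_time(stage_delays)
    return [clock - d for d in stage_delays]

delays_ns = [10, 8, 8, 12, 10, 9]       # hypothetical delays for FI, DI, CO, FO, EI, WO
print(cycle_time(delays_ns))            # 12
print(idle_time_per_cycle(delays_ns))   # [2, 4, 4, 0, 2, 3]
```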


5. Explain various pipeline hazards with example (05Marks)

6. Write short note on pipeline hazards (07Marks)

Pipeline Hazards

A pipeline hazard occurs when the pipeline must stall because some conditions do not permit continued execution. Such a pipeline stall is also referred to as a pipeline bubble. There are three types of hazards: resource, data and control.

Resource Hazard (Structural Hazard)

A resource hazard occurs when two or more instructions in the pipeline need the same resource.


Data Hazards

A data hazard occurs when there is a conflict in the access of an operand location.

Hazards are caused by resource usage conflicts among various instructions

They are triggered by inter-instruction dependencies

Terminologies:

• Resource Objects: set of working registers, memory locations and special flags

• Data Objects: Content of resource objects

• Each Instruction can be considered as a mapping from a set of data objects to a set of data objects.

• Domain D(I): set of resource objects whose data objects may affect the execution of instruction I (e.g. source registers).

• Range R(I): set of resource objects whose data objects may be modified by the execution of instruction I (e.g. destination register).

• Instruction reads from its domain and writes in its range

Consider the execution of instructions I and J, where J appears immediately after I.

There are 3 types of data-dependent hazards:

1. RAW (Read After Write)

2. WAW(Write After Write)

3. WAR (Write After Read)
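Each of the three hazard types corresponds to a non-empty intersection between the domain and range sets of I and J, so the check can be sketched with Python sets. The Instruction type and the function data_hazards are hypothetical names. The sketch also assumes a two-operand format in which the first operand is both a source and the destination, so ADD r2,r1 reads r1 and r2 and writes r2.

```python
from typing import NamedTuple

class Instruction(NamedTuple):
    text: str
    reads: frozenset    # domain D(I): resources the instruction reads
    writes: frozenset   # range R(I): resources the instruction writes

def data_hazards(i, j):
    """Hazards that may arise when instruction J follows instruction I."""
    hazards = []
    if i.writes & j.reads:     # R(I) and D(J) overlap
        hazards.append("RAW")
    if i.writes & j.writes:    # R(I) and R(J) overlap
        hazards.append("WAW")
    if i.reads & j.writes:     # D(I) and R(J) overlap
        hazards.append("WAR")
    return hazards

load = Instruction("LOAD r1,a", frozenset({"a"}), frozenset({"r1"}))
add = Instruction("ADD r2,r1", frozenset({"r1", "r2"}), frozenset({"r2"}))
print(data_hazards(load, add))   # ['RAW']

mul = Instruction("MUL r1,r2", frozenset({"r1", "r2"}), frozenset({"r1"}))
add2 = Instruction("ADD r2,r3", frozenset({"r2", "r3"}), frozenset({"r2"}))
print(data_hazards(mul, add2))   # ['WAR']
```

A non-empty intersection is only a necessary condition: whether a hazard actually occurs also depends on the timing of the two instructions in the pipeline.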

RAW (Read After Write)

The necessary condition for this hazard is

R(I) ∩ D(J) ≠ φ


• Example:

I1 : LOAD r1,a

I2 : ADD r2,r1

• I2 cannot be correctly executed until r1 is loaded

• Thus I2 is RAW dependent on I1

WAW(Write After Write)

• The necessary condition is

R(I) ∩ R(J) ≠ φ

• Example:

I1 : MUL r1,r2

I2 : ADD r1,r4

• Here I1 and I2 write to the same destination and hence they are said to be WAW dependent.

WAR(Write After Read)

• The necessary condition is

D(I) ∩ R(J) ≠ φ

• Example:

I1 : MUL r1,r2

I2 : ADD r2,r3

• Here I2 has r2 as destination while I1 uses it as source and hence they are WAR dependent.

• Hazards can be detected in the fetch stage by comparing the domain and range of the incoming instruction with those of the instructions already in the pipeline.

• Once a hazard is detected, there are two methods of handling it:

1. Generate a warning signal to prevent the hazard.

2. Allow the incoming instruction through the pipe and distribute detection to all pipeline stages.

Control Hazards

Instructions that disrupt the sequential flow of control present problems for pipelines. The effects of these instructions can't be exactly determined until late in the pipeline, so instruction fetch can't continue unless we do something special. The following types of instructions can introduce control hazards:

• Unconditional branches

• Conditional branches

• Indirect branches

• Procedure calls

• Procedure returns

Solutions for Control Hazards

The following are solutions that have been proposed for mitigating aspects of control hazards:

Pipeline stall cycles. Freeze the pipeline until the branch outcome and target are known, then proceed with fetch. Thus, every branch instruction incurs a penalty equal to the number of stall cycles. This solution is unsatisfactory if the instruction mix contains many branch instructions, and/or the pipeline is very deep.
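The cost of stalling on every branch can be estimated as an increase in the average cycles per instruction (CPI). The sketch below assumes a base CPI of 1 for an otherwise ideal pipeline and the same stall count for every branch; the figures used (20% branches, 3 stall cycles) are made up for illustration.

```python
def effective_cpi(branch_fraction, stall_cycles, base_cpi=1.0):
    """Average CPI when every branch stalls the pipeline for stall_cycles."""
    return base_cpi + branch_fraction * stall_cycles

# Hypothetical mix: 20% of instructions are branches, each costing 3 stall cycles.
print(round(effective_cpi(0.2, 3), 2))  # 1.6
```

With these numbers the program runs about 60% slower than the ideal pipeline, and the loss grows with both the branch frequency and the pipeline depth.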

Branch delay slots. The ISA is constructed such that one or more instructions sequentially following a conditional branch instruction are executed whether or not the branch is taken. The compiler or assembly language writer must fill these branch delay slots with useful instructions or NOPs (no-operation opcodes). This solution doesn't extend well to deeper pipelines, and becomes architectural baggage that the ISA must carry into future implementations.

Branch prediction. The outcome and target of conditional branches are predicted using some heuristic. Instructions are speculatively fetched and executed down the predicted path, but results are not written back to the register file until the branch is executed and the prediction is verified. When a branch is predicted, the processor enters a speculative mode in which results are written to another register file that mirrors the architected register file. Another pipeline stage called the commit stage is introduced to handle writing verified speculatively obtained results back into the "real" register file. Branch predictors can't be 100% accurate, so there is still a penalty for branches that is based on the branch misprediction rate.

Indirect branch prediction. Branches such as virtual method calls, computed gotos and jumps through tables of pointers can be predicted using various techniques.

Return address stack (RAS). Procedure returns are a form of indirect jump that can be perfectly predicted with a stack as long as the call depth doesn't exceed the stack depth. Return addresses are pushed onto the stack at a call and popped off at a return.
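The behaviour of a return address stack, including what happens when the call depth exceeds the stack depth, can be sketched as follows. The class name, the method names and the overflow policy (discard the oldest entry) are assumptions made for illustration; the addresses are arbitrary.

```python
class ReturnAddressStack:
    def __init__(self, depth):
        self.depth = depth
        self.entries = []

    def on_call(self, return_address):
        """Push the return address; on overflow the oldest entry is lost."""
        if len(self.entries) == self.depth:
            self.entries.pop(0)
        self.entries.append(return_address)

    def on_return(self):
        """Pop the predicted return address, or None if the stack is empty."""
        return self.entries.pop() if self.entries else None

ras = ReturnAddressStack(depth=2)
for address in (100, 200, 300):   # three nested calls, but only two stack slots
    ras.on_call(address)

print(ras.on_return())  # 300 (correct)
print(ras.on_return())  # 200 (correct)
print(ras.on_return())  # None (address 100 was lost, so this return is not predicted)
```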

