
RISC ARCHITECTURE

BY TEDDY LEE

TOPICS

• REVIEW OF RISC

• RISC ARCHITECTURE

• RISC VS. CISC

• PA-RISC HP ARCHITECTURE

RISC REVIEW

Main Features:

• One cycle execution

• Pipelining

• Large Number of Registers

RISC – Reduced Instruction Set Computer

RISC ARCHITECTURE

Features:

• Word Width

• Split or common cache

• On-chip or off-chip cache

• Write buffer

• Prefetch buffer

• Harvard or Princeton Architecture

• Common register file or private registers

FEATURES

• Word Width: Most RISC processors use a 32-bit internal and external word width

• Split or Common Cache: A cache is needed between the RISC processor and main memory; it can be split into separate instruction and data caches or shared as one common cache

FEATURES cont.

• On-Chip or Off-Chip Cache: The cache can sit on the processor chip, which shortens access time, or off the chip, which simplifies the design of the integer unit

Example – SPARC chip

• Write Buffer: Holds pending writes so the processor can continue without waiting for memory, which makes data accesses faster

FEATURES cont.

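• Prefetch Buffer: Fetches instructions ahead of use, which makes instruction cache accesses faster

• Harvard or Princeton Architecture: The design choice of accessing the data and instruction caches over separate paths (Harvard) or over a shared path (Princeton)

Examples: Motorola 88000, MIPS R3000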

FEATURES cont.

• Common Register File or Private Registers

Common Register File – can be accessed by all execution units

Private Registers – each set works with its own execution unit

Examples: Motorola 88000, IBM RS/6000

RISC VS. CISC

Multiplication Example

Let’s find the product of two numbers, one stored at memory location 2:3 and the other at 5:2, and then store the product back into location 2:3.

RISC VS. CISC cont.

CISC Approach:

MULT 2:3, 5:2

Higher Level Language:

int a, b;

a = a * b;

RISC VS. CISC cont.

RISC APPROACH:

LOAD A, 2:3

LOAD B, 5:2

PROD A, B

STORE 2:3, A
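The two approaches can be compared with a small sketch. This is a toy model only: the memory dictionary, the function names, and the operand values 6 and 7 are assumptions made for illustration, and the instruction counts simply mirror the slide's listings.

```python
# Toy model of the slide's example. Memory cells are keyed by "row:column"
# strings; the machine and the operand values are hypothetical.

def cisc_mult(memory, dst, src):
    """MULT dst, src: one complex instruction that reads both operands
    from memory, multiplies them, and writes the product back to dst."""
    memory[dst] = memory[dst] * memory[src]
    return 1  # number of instructions issued


def risc_mult(memory, dst, src):
    """The same task as four simple instructions using registers A and B."""
    regs = {}
    regs["A"] = memory[dst]              # LOAD  A, 2:3
    regs["B"] = memory[src]              # LOAD  B, 5:2
    regs["A"] = regs["A"] * regs["B"]    # PROD  A, B
    memory[dst] = regs["A"]              # STORE 2:3, A
    return 4  # number of instructions issued


mem_cisc = {"2:3": 6, "5:2": 7}
mem_risc = dict(mem_cisc)

n_cisc = cisc_mult(mem_cisc, "2:3", "5:2")
n_risc = risc_mult(mem_risc, "2:3", "5:2")

print("CISC:", mem_cisc["2:3"], "in", n_cisc, "instruction")
print("RISC:", mem_risc["2:3"], "in", n_risc, "instructions")
```

Both versions leave the same product in location 2:3; they differ only in how many instructions the program contains and in whether the operands remain available in registers afterwards.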

RISC VS. CISC cont.

ADVANTAGES

CISC:

• Emphasis on hardware

• Includes multi-clock complex instructions

• Memory-to-memory: “LOAD” and “STORE” incorporated in instructions

• Small code sizes, high cycles per instruction

• Transistors used for storing complex instructions

RISC:

• Emphasis on software

• Single-clock, reduced instructions only

• Register-to-register: “LOAD” and “STORE” are independent instructions

• Low cycles per instruction, large code sizes

• Spends more transistors on memory registers

RISC VS. CISC cont.

CISC APPROACH ANALYSIS: CISC attempts to minimize the number of instructions per program, at the cost of more cycles per instruction.

RISC APPROACH ANALYSIS: RISC does the opposite, reducing the cycles per instruction at the cost of more instructions per program.

The Performance Equation
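The performance equation is commonly written as time per program = (time per cycle) x (cycles per instruction) x (instructions per program). The sketch below plugs numbers into it for the multiplication example. All of the figures are hypothetical, chosen only to show how the trade-off works; they are not measurements of any real processor.

```python
def execution_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle"""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC machine: one MULT instruction that takes 8 cycles.
cisc_ns = execution_time_ns(instructions=1, cycles_per_instruction=8, cycle_time_ns=10)

# Hypothetical RISC machine: four instructions that take 1 cycle each.
risc_ns = execution_time_ns(instructions=4, cycles_per_instruction=1, cycle_time_ns=10)

print("CISC:", cisc_ns, "ns")
print("RISC:", risc_ns, "ns")
```

With these assumed numbers the RISC sequence finishes sooner despite containing more instructions. Different assumptions about cycle counts or clock rate can reverse the result, which is why all three factors have to be considered together.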

PA-RISC Architecture from HP

What is PA-RISC?

• Definition: PA-RISC stands for Precision Architecture, Reduced Instruction Set Computer

• Hewlett-Packard was the first computer company to replace its entire CISC machine families with RISC machines.

• Current versions run the MPE/iX and HP-UX operating systems.

PA-RISC HP cont.

Some Features:

• Machine Instruction Formats

• Registers

• Delayed Branching

• Multiply Instruction

Machine Instruction Format

PA-RISC machines use an instruction set based on 32-bit general-purpose registers

Assembler Code        Explanation

LDIL  $2000,8         R8 := $2000
LDO   32(8),8         R8 := R8 + 32
LDW   -12(0,5),21     R21 := memory(R5-12)
STH   8,0(0,21)       memory(R21) := R8
LDH   -8(0,5),22      R22 := memory(R5-8)
EXTRS 22,31,16,22     R22 := sign-extended(R22)
LDO   -1(22),22       R22 := R22 - 1

LDW   d(s,b),t
STH   r,d(s,b)
LDO   d(b),t
LDIL  i,t
EXTRS r,p,len,t

d = displacement
t = target register
i = immediate value
s = space id
r = source register
p = bit position
b = base register
(s,b) = memory address
len = number of bits
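The sign-extension step in the listing can be sketched as follows. This is a minimal model under stated assumptions: bit positions are numbered with bit 0 as the most significant bit and bit 31 as the least significant, p names the rightmost bit of the field, len is the field length in bits, and LDH is taken to leave the upper 16 bits of the register zero. The function names and the sample halfword value are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def extrs(r, p, length):
    """Model of EXTRS r,p,len,t: take the `length`-bit field of r whose
    rightmost bit is at position p (bit 0 = MSB, bit 31 = LSB) and
    sign-extend it to 32 bits. Returns the new 32-bit register value."""
    field = (r >> (31 - p)) & ((1 << length) - 1)
    if field & (1 << (length - 1)):   # sign bit of the field is set
        field -= 1 << length          # make it negative
    return field & MASK32


def to_signed(x):
    """Interpret a 32-bit register value as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


# Last three instructions of the listing, with a hypothetical halfword
# 0xFFFE (-2 as a signed 16-bit value) stored at memory(R5-8).
r22 = 0xFFFE                  # LDH   -8(0,5),22   (upper bits are zero)
r22 = extrs(r22, 31, 16)      # EXTRS 22,31,16,22  (sign-extend halfword)
r22 = (r22 - 1) & MASK32      # LDO   -1(22),22    (R22 := R22 - 1)

print(hex(r22), to_signed(r22))
```

Without the EXTRS step the register would hold 65534 rather than -2 before the decrement, so the final result would be wrong for negative halfwords.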

Machine Instruction Format cont.

Arithmetic: ADD@ and SUB@.

Branches: B@ as in BL Branch and Link, BV Branch Vectored.

Compare and Branch: C@ as in COMIBF, COMpare Immediate and Branch If False.

Extract: EXTRS for signed and EXTRU for unsigned.

Load: L@ as in LDH load halfword, LDO load offset.

Shift: SH@ as in SH2ADD Shift 2 and Add.

Store: ST@ as in STB Store Byte, STW Store Word.

Registers

The PA-RISC uses 32 general purpose registers

R0 = bit bucket and source of zero value

R1 = target of ADDIL (Add Immediate Literal)

R2 = RP Return Pointer where BL places address and where BV gets it

R23 = fourth parameter of a procedure call

R24 = third parameter of a procedure call

R25 = second parameter of a procedure call

R26 = first parameter of a procedure call

R27 = DP Data Pointer to base of global data

R28-29 = function result: in R28 if 32 bits, in both if 64 bits

R30 = SP Stack Pointer to parameters and exit data

R31 = receives the return address in a BLE (Branch and Link External) instruction

Registers cont.

PA-RISC systems also have eight 32-bit Space Registers

SR 0 = return address of inter-space procedure calls

SR 1 = Temporary use for constructing long pointers



SR 4 = Code space

SR 5 = process private data: stack and heap

SR 6 = Shared data

SR 7 = System public code, literals, and data

Delayed Branching

• The goal of the PA-RISC architecture is to complete the execution of one useful instruction in each machine cycle. A branch instruction is hard to complete in one cycle.

• Pipelining overlaps the execution of successive instructions, but a branch disrupts the pipeline: the instruction after the branch has already been fetched before the branch target is known.

Problems with Branching

Delayed Branching cont.

Solution

• Delay the effect of the branch for one cycle

• Execute the instruction that follows the branch (located in the delay slot) before control passes to the branch destination.

• Let the compiler look for an instruction to put in the delay slot, one that can be executed during the branch operation

Delayed Branching cont.

Example:

BL opencarton ; branch

LDW 26 ... ; load word into register during delay

BL closecarton ; branch

NOP ; code 8000240, actually OR 0,0,0
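The execution order in the example can be traced with a small simulator. This is only a sketch of the delay-slot idea: the tuple-based program format, the OPEN and HALT pseudo-instructions, and the label table are invented for illustration, and the return from the called routine is not modeled.

```python
def run(program, labels):
    """Execute a list of (op, arg) tuples on a toy machine in which every
    branch takes effect one instruction late (one delay slot)."""
    trace = []
    pc = 0
    pending_target = None  # branch target waiting for the delay slot to finish
    while pc < len(program):
        op, arg = program[pc]
        trace.append(f"{op} {arg}".strip())
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction sat in the delay slot; the branch lands now.
            next_pc = pending_target
            pending_target = None
        if op == "BL":
            pending_target = labels[arg]
        elif op == "HALT":
            break
        pc = next_pc
    return trace


program = [
    ("BL", "opencarton"),  # 0: branch, takes effect after the next instruction
    ("LDW", "26"),         # 1: delay slot, still executed
    ("HALT", ""),          # 2: skipped, control has moved to opencarton
    ("OPEN", ""),          # 3: opencarton (hypothetical routine body)
    ("HALT", ""),          # 4: stop the sketch
]
labels = {"opencarton": 3}

trace = run(program, labels)
print(trace)
```

The trace shows the LDW in the delay slot running between the BL and the first instruction of opencarton, which is the ordering the compiler relies on when it fills the slot with useful work instead of a NOP.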

Delayed Branching cont.

Delayed Branching is the same as if we could pack our bags while flying to our destination:

1. book our flight

2. reserve hotel room

3. reserve rental car

4. fly to destination

5. (pack suitcase during the delay slot)

6. collect baggage

7. get rental car

8. check into hotel

Multiplying Instruction

• Integer multiply and divide are not supported in hardware on PA-RISC

• Find a way to optimize, at compile time, the multiplications by constants that programs use frequently

• A multiply can be converted to a series of additions and smaller multiplies

Example:

120 = 10 x 12

120 = (5 x 12) + (5 x 12)

120 = ((4 x 12) + 12) + ((4 x 12) + 12)

Multiplying Instruction cont.

Solution

• Use Shift and Add machine instructions to multiply a register by 2, 4, or 8 and add the result to any register in one cycle.

SH2ADD x,x,x ; shift x left 2 bits (multiply by 4), add x, store in x (x := 5x)

ADD x,x,x ; add register x to itself and store in x (x := 2x)

• The compiler can convert a multiplication by a constant into a series of Shift and Add instructions
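The two instructions above, executed in sequence, multiply a register by 10. The sketch below models them on 32-bit values. The Python function names are invented for illustration, and overflow is simply wrapped to 32 bits rather than modeled in detail.

```python
MASK32 = 0xFFFFFFFF


def shadd(k, a, b):
    """Model of the shift-and-add family: shift a left by k bits
    (k = 1, 2, or 3), add b, and wrap the result to 32 bits."""
    return ((a << k) + b) & MASK32


def sh2add(a, b):
    """Model of SH2ADD a,b,t: t := 4*a + b."""
    return shadd(2, a, b)


def add(a, b):
    """Model of ADD a,b,t: t := a + b."""
    return (a + b) & MASK32


def times_10(x):
    """Multiply by the constant 10 using only shift-and-add and add."""
    x = sh2add(x, x)   # SH2ADD x,x,x : x := 4x + x = 5x
    x = add(x, x)      # ADD    x,x,x : x := 5x + 5x = 10x
    return x


print(times_10(12))
```

For x = 12 this reproduces the earlier decomposition 120 = ((4 x 12) + 12) + ((4 x 12) + 12) in two single-cycle instructions.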

RISC Systems and Processors

• PA-RISC Systems - http://www.testdrive.hp.com/systems/pa-risc.shtml

• RISC Processor - http://www.atmel.com/products/avr/

• MIPS Technology - http://www.mips.com/
