RISC ARCHITECTURE
BY TEDDY LEE
TOPICS
• REVIEW OF RISC
• RISC ARCHITECTURE
• RISC VS. CISC
• PA-RISC HP ARCHITECTURE
RISC REVIEW
Main Features:
• One cycle execution
• Pipelining
• Large Number of Registers
RISC – Reduced Instruction Set Computer
RISC ARCHITECTURE
Features:
• Word Width
• Split or common cache
• On-chip or off-chip cache
• Write buffer
• Prefetch buffer
• Harvard or Princeton Architecture
• Common register file or private registers
FEATURES
• Word Width: Most RISC processors use a 32-bit internal and external word width
• Split or Common Cache: A cache is needed between the RISC processor and main memory; it can be split into separate instruction and data caches, or shared by both
FEATURES cont.
• On-Chip or Off-Chip Cache: An on-chip cache shortens access time; an off-chip cache simplifies the design of the integer unit
Example – SPARC chip
• Write Buffer: Holds pending writes so the processor does not wait on memory, making data access faster
FEATURES cont.
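• Prefetch Buffer: Fetches instructions ahead of use, making instruction cache access faster
• Harvard or Princeton Architecture: The design used to access the data and instruction caches. Harvard uses separate instruction and data paths; Princeton shares one path for both
Examples: Motorola 88000, MIPS R3000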
FEATURES cont.
• Common Register File or Private Registers
Common Register File – can be accessed by all execution units
Private Registers – each set works with its own execution unit
Examples: Motorola 88000, IBM RS/6000
RISC VS. CISC
Multiplication Example
Let’s find the product of two numbers, one stored at memory location 2:3 and the other at 5:2, and then store the product back into 2:3
RISC VS. CISC cont.
CISC Approach:
MULT 2:3, 5:2
Higher Level Language:
int a, b;
a = a * b;
RISC VS. CISC cont.
RISC APPROACH:
LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A
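The two approaches can be sketched on a toy machine. This is only an illustration: the dictionary-based memory and registers, the function names `run_cisc` and `run_risc`, and the sample values 6 and 7 are assumptions made for the sketch, not part of any real instruction set.

```python
# Toy model: memory is keyed by "row:column" locations as in the slides.
# Both functions and the sample values are hypothetical.

def run_cisc(memory):
    """MULT 2:3, 5:2 : one memory-to-memory instruction."""
    memory["2:3"] = memory["2:3"] * memory["5:2"]
    return memory


def run_risc(memory):
    """The same work as four simple instructions using registers."""
    registers = {}
    registers["A"] = memory["2:3"]                    # LOAD A, 2:3
    registers["B"] = memory["5:2"]                    # LOAD B, 5:2
    registers["A"] = registers["A"] * registers["B"]  # PROD A, B
    memory["2:3"] = registers["A"]                    # STORE 2:3, A
    return memory


cisc_memory = run_cisc({"2:3": 6, "5:2": 7})
risc_memory = run_risc({"2:3": 6, "5:2": 7})
print(cisc_memory["2:3"], risc_memory["2:3"])
```

Both versions leave the same product in 2:3. The difference is that the RISC version still holds both operands in registers afterwards, so they can be reused without another memory access.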
RISC VS. CISC cont.
ADVANTAGES
CISC
• Emphasis on hardware
• Includes multi-clock complex instructions
• Memory-to-memory: “LOAD” and “STORE” incorporated in instructions
• Small code sizes, high cycles per instruction
• Transistors used for storing complex instructions
RISC
• Emphasis on software
• Single-clock, reduced instructions only
• Register-to-register: “LOAD” and “STORE” are independent instructions
• Low cycles per instruction, large code sizes
• Spends more transistors on memory registers
RISC VS. CISC cont.
CISC APPROACH ANALYSIS: CISC attempts to minimize the number of instructions per program, at the cost of more cycles per instruction.
RISC APPROACH ANALYSIS: RISC does the opposite, reducing the cycles per instruction at the cost of more instructions per program.
The Performance Equation
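A minimal sketch of the performance equation, assuming its usual form: time per program = (instructions per program) x (cycles per instruction) x (time per cycle). The instruction counts, cycle counts, and the 10 ns cycle time below are invented purely to show how the two approaches trade one factor against another; they are not measurements.

```python
def execution_time_ns(instructions, cycles_per_instruction, ns_per_cycle):
    """Time per program = instructions x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * ns_per_cycle


# Hypothetical figures for the multiplication example:
# CISC: 1 complex instruction taking several cycles.
# RISC: 4 simple instructions taking one cycle each.
cisc_ns = execution_time_ns(instructions=1, cycles_per_instruction=6, ns_per_cycle=10)
risc_ns = execution_time_ns(instructions=4, cycles_per_instruction=1, ns_per_cycle=10)
print(cisc_ns, risc_ns)
```

With other assumed figures the comparison can go either way, which is the point of the equation: neither the instruction count nor the cycles per instruction decides performance alone.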
PA-RISC Architecture from HP
What is PA-RISC?
• Definition: PA-RISC stands for Precision Architecture, Reduced Instruction Set Computer
• Hewlett-Packard was the first computer company to replace its entire CISC machine families with RISC machines.
• Current versions run the MPE/iX and HP-UX operating systems.
PA-RISC HP cont.
Some Features:
• Machine Instruction Formats
• Registers
• Delayed Branching
• Multiply Instruction
Machine Instruction Format
PA-RISC machines use an instruction set based on 32-bit general-purpose registers
Assembler Code          Explanation
LDIL  $2000,8           R8:=$2000
LDO   32(8),8           R8:=R8 + 32
LDW   -12(0,5),21       R21:=memory(R5-12)
STH   8,0(0,21)         memory(R21):=R8
LDH   -8(0,5),22        R22:=memory(R5-8)
EXTRS 22,31,16,22       R22:=sign-extended(R22)
LDO   -1(22),22         R22:=R22 - 1
LDW d(s,b),t
STH r,d(s,b)
LDO d(b),t
LDIL i,t
EXTRS r,p,len,t
d = displacement
t = target register
i = immediate value
s = space id
r = source register
p = bit position
b = base register
(s,b) = memory address
len = number of bits
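The EXTRS step in the listing above can be sketched as follows. The sketch assumes PA-RISC bit numbering, where bit 0 is the most significant bit and bit 31 the least significant, and that p names the rightmost bit of the extracted field. The function name `extrs` and its Python signature are hypothetical.

```python
def extrs(value, p, length):
    """Sketch of EXTRS r,p,len,t on a 32-bit register value.

    Extracts the `length`-bit field whose rightmost bit is at position p
    (bit 0 = most significant, bit 31 = least significant) and returns it
    as a sign-extended integer.
    """
    shift = 31 - p                        # distance of the field from the low end
    field = (value >> shift) & ((1 << length) - 1)
    if field & (1 << (length - 1)):       # top bit of the field is the sign
        field -= 1 << length
    return field


# EXTRS 22,31,16,22: sign-extend the halfword in the low 16 bits.
print(extrs(0x0000FFFF, 31, 16))
print(extrs(0x00007FFF, 31, 16))
```

This is why the listing follows LDH with EXTRS: the halfword load leaves a 16-bit value in the register, and the extract turns it into a signed 32-bit quantity before the subtraction.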
Machine Instruction Format cont.
Arithmetic: ADD@ and SUB@.
Branches: B@ as in BL Branch and Link, BV Branch Vectored.
Compare and Branch: C@ as in COMIBF, COMpare Immediate and Branch If False.
Extract: EXTRS for signed and EXTRU for unsigned.
Load: L@ as in LDH load halfword, LDO load offset.
Shift: SH@ as in SH2ADD Shift 2 and Add.
Store: ST@ as in STB Store Byte, STW Store Word.
Registers
The PA-RISC uses 32 general purpose registers
R0 = bit bucket and source of zero value
R1 = target of ADDIL (Add Immediate Literal)
R2 = RP Return Pointer where BL places address and where BV gets it
R23 = fourth parameter of a procedure call
R24 = third parameter of a procedure call
R25 = second parameter of a procedure call
R26 = first parameter of a procedure call
R27 = DP Data Pointer to base of global data
R28-29 = function result in R28 if 32-bits, both if 64-bits
R30 = SP Stack Pointer to parameters and exit data
R31 = receives target branch address in BLE instruction
Registers cont.
PA-RISC systems also have eight 32-bit Space Registers
SR 0 = return address of inter-space procedure calls
SR 1 = Temporary use for constructing long pointers
SR 2 = Temporary use for constructing long pointers
SR 3 = Temporary use for constructing long pointers
SR 4 = Code space
SR 5 = process private data: stack and heap
SR 6 = Shared data
SR 7 = System public code, literals, and data
Delayed Branching
• The goal of the PA-RISC architecture is to complete the execution of a useful instruction in each machine cycle. The branch instruction is hard to implement in one cycle.
• Pipelining is used to overlap the execution of instructions, but a branch disrupts the pipeline: the instruction after the branch has already been fetched before the branch target is known.
Problems with Branching
Delayed Branching cont.
Solution
• Delay the execution of the branch for one cycle
• Execute the instruction immediately following the branch (located in the delay slot) before control passes to the branch destination.
• Let the compiler look for an instruction to put in the delay slot, one that can be executed during the branch operation
Delayed Branching cont.
Example:
BL opencarton ; branch
LDW 26 ... ; load word into register during delay
BL closecarton ; branch
NOP ; code 8000240, actually OR 0,0,0
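The effect of the delay slot can be sketched with a toy interpreter. This is a simplified model, not PA-RISC itself: the function `run`, the tuple-based program format, and the instructions other than BL and LDW are hypothetical, and the link (return address) part of BL is left out.

```python
def run(program, labels):
    """Execute a list of (op, arg) tuples and return the order of execution.

    A 'BL' takes effect one instruction late: the instruction in the
    delay slot runs first, then control passes to the branch target.
    """
    trace = []
    pc = 0
    pending_target = None
    while pc < len(program):
        op, arg = program[pc]
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction sits in the delay slot of the previous branch.
            next_pc = pending_target
            pending_target = None
        if op == "BL":
            pending_target = labels[arg]
        trace.append(op if arg is None else f"{op} {arg}")
        pc = next_pc
    return trace


program = [
    ("BL", "opencarton"),   # branch
    ("LDW", "26"),          # delay slot: still executed
    ("SKIPPED", None),      # never reached
    ("OPEN", None),         # opencarton: branch target
]
trace = run(program, {"opencarton": 3})
print(trace)
```

The trace shows the load executing between the branch and its target, which is the behavior the compiler exploits when it fills the slot with useful work instead of a NOP.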
Delayed Branching cont.
Delayed Branching is the same as if we could pack our bags while flying to our destination:
1. book our flight
2. reserve hotel room
3. reserve rental car
4. fly to destination
5. (pack suitcase during the delay slot)
6. collect baggage
7. get rental car
8. check into hotel
Multiplying Instruction
• Integer multiply and divide are not supported by hardware on a PA-RISC
• Multiplication by a constant is frequent, so the compiler optimizes it at compile time
• A multiply can be converted to a series of additions and smaller multiplies
Example:
120 = 10 x 12
120 = (5 x 12) + (5 x 12)
120 = ((4 x 12) + 12) + ((4 x 12) + 12)
Multiplying Instruction cont.
Solution
• Use Shift and Add machine instructions to multiply a register by 2, 4, or 8 and add to any register in one cycle.
SH2ADD x,x,x ; shift x 2 bits (multiply by 4), add to x, store in x
ADD x,x,x ; add register x to itself and store in x
• The compiler can convert a multiplication by a constant into a series of Shift and Add instructions
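The two instructions above multiply a register by 10, which matches the earlier breakdown 120 = ((4 x 12) + 12) + ((4 x 12) + 12). A minimal sketch, assuming 32-bit registers that wrap on overflow; the helper names `sh_add`, `add`, and `times_10` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assume 32-bit registers that wrap around


def sh_add(shift, a, b):
    """Model of SH1ADD/SH2ADD/SH3ADD a,b,t: t = (a << shift) + b."""
    if shift not in (1, 2, 3):
        raise ValueError("shift must be 1, 2, or 3")
    return ((a << shift) + b) & MASK32


def add(a, b):
    """Model of ADD a,b,t: t = a + b."""
    return (a + b) & MASK32


def times_10(x):
    """Multiply by the constant 10 using only shift-and-add steps."""
    x = sh_add(2, x, x)  # SH2ADD x,x,x : x = 4x + x = 5x
    x = add(x, x)        # ADD x,x,x    : x = 5x + 5x = 10x
    return x


print(times_10(12))
```

Other constants follow the same pattern, for example `sh_add(3, x, x)` gives 9x in one step.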
RISC Systems and Processors
• PA-RISC Systems - http://www.testdrive.hp.com/systems/pa-risc.shtml
• RISC Processor - http://www.atmel.com/products/avr/
• MIPS Technology - http://www.mips.com/
References
1. http://www.robelle.com/library/smugbook/pa-risc.html
2. http://cse.stanford.edu/class/sophomore-college/projects-00/risc/whatis/index.html
3. http://www.inf.fu-berlin.de/lehre/WS94/RA/RISC-9.html
4. Anthony J. Dos Reis, Assembly Language and Computer Architecture Using C++ and Java, (United States: Course Technology, Copyright 2004).