CSC 3210 Computer Organization and Programming
Chapter 2
SPARC Architecture
Dr. Anu Bourgeois
Introduction
• SPARC is a load/store architecture
• Registers used for all arithmetic and logical operations
• 32 registers available at a time
• Uses only load and store instructions to access memory
Registers
• Registers are accessed directly for rapid computation
• 32 registers, divided into 4 sets:
  -- Global: %g0 - %g7
  -- Out: %o0 - %o7
  -- In: %i0 - %i7
  -- Local: %l0 - %l7
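• %g0 always reads as 0, and writes to it are discarded
• %o6, %o7, %i6, %i7: do not use for general computation (they hold the stack pointer, the called subroutine return address, the frame pointer, and the subroutine return address)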
• Register size = 32 bits each
Table of Registers

Global registers    Out registers         Local registers     In registers
Register Synonym    Register Synonym      Register Synonym    Register Synonym
%g0*     %r0        %o0      %r8          %l0      %r16       %i0      %r24
%g1      %r1        %o1      %r9          %l1      %r17       %i1      %r25
%g2      %r2        %o2      %r10         %l2      %r18       %i2      %r26
%g3      %r3        %o3      %r11         %l3      %r19       %i3      %r27
%g4      %r4        %o4      %r12         %l4      %r20       %i4      %r28
%g5      %r5        %o5      %r13         %l5      %r21       %i5      %r29
%g6      %r6        %o6      %r14, %sp    %l6      %r22       %i6      %r30, %fp
%g7      %r7        %o7#     %r15         %l7      %r23       %i7^     %r31

* -- Always discards writes and returns zero
# -- Called subroutine return address
^ -- Subroutine return address
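The register numbering and the special behavior of %g0 in the table above can be sketched in C. This is only an illustrative model: the names RegFile, reg_read, reg_write, and reg_number are hypothetical and do not come from the slides or from any SPARC tool.

```c
#include <stdint.h>

/* Hypothetical model of the 32 visible registers %r0..%r31. */
typedef struct {
    uint32_t r[32];
} RegFile;

/* Read register n; %r0 (%g0) always reads as zero. */
uint32_t reg_read(const RegFile *rf, unsigned n) {
    n &= 31u;
    return (n == 0) ? 0u : rf->r[n];
}

/* Write register n; writes to %r0 (%g0) are discarded. */
void reg_write(RegFile *rf, unsigned n, uint32_t value) {
    n &= 31u;
    if (n != 0) {
        rf->r[n] = value;
    }
}

/* Map a set letter ('g', 'o', 'l', 'i') and an index 0..7 to the %r number
   from the table: globals 0-7, outs 8-15, locals 16-23, ins 24-31.
   Returns -1 for an unknown set or an index above 7. */
int reg_number(char set, unsigned index) {
    if (index > 7) {
        return -1;
    }
    switch (set) {
        case 'g': return (int)index;
        case 'o': return 8 + (int)index;
        case 'l': return 16 + (int)index;
        case 'i': return 24 + (int)index;
        default:  return -1;
    }
}
```

For instance, reg_number('o', 6) gives 14, matching the %o6 / %r14 / %sp row of the table.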
SPARC Assembler
• The SPARC assembler, as, is a 2-pass assembler
• First pass:
  – Updates location counter without paying attention to undefined labels for operands
  – Defines label symbol to location counter
• Second pass:
  – Values substituted in for labels
  – Ignores labels followed by colons
Assembly Language Programs
• Programs are line based
• Use mnemonics which generate machine code upon assembling
• Statements may be labeled
• Comments: ! or /* … */

/* instructions to add and to subtract the contents of %o0 and %o1 */
start:  add %o0, %o1, %l0    ! %l0 = %o0 + %o1
        sub %o0, %o1, %l1    ! %l1 = %o0 - %o1
Pseudo-ops
• Statements that do not generate machine code
  – e.g. data definitions, statements that give the assembler information
• Generally start with a period
a: .word 3
• Can be labeled
.global main
main:
Compiling Code – 2 step process
• The C compiler calls as to produce the object files
• Object files are the machine code
• It then calls the linker to combine the .o files with library routines to produce the executable program, a.out
Compiling a C program
% gcc -S program.c : produces the .s assembly language file
% gcc expr.s -o expr : assembles the program and produces the executable file
NOTE: You will only do this for the 1st assignment
Start of Execution
• C compiler expects to start execution at an address main
• The label must be at the first statement to execute and declared to be global
        .global main
main:   save %sp, -96, %sp
• save instruction provides space to save registers for the debugger
Macros
• If we have macros defined, then the program should be a .m file
• We can expand the macros to produce a .s file by running m4 first
% m4 expr.m > expr.s
% gcc expr.s -o expr
SPARC Instructions
• 3 operands: 2 source operands and 1 destination operand
• Source registers are unchanged
• Result stored in destination register
• Constants: -4096 ≤ c < 4096
    op reg_rs1, reg_rs2, reg_rd
    op reg_rs1, imm, reg_rd
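The range -4096 ≤ c < 4096 is exactly what a 13-bit two's-complement field can hold, which is how SPARC stores the immediate operand. A minimal sketch of the range check and of the encode/decode round trip follows; the function names are illustrative only and are not part of any assembler.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if c fits a 13-bit signed immediate: -4096 <= c < 4096. */
bool fits_simm13(int32_t c) {
    return c >= -4096 && c < 4096;
}

/* Keep the low 13 bits of c (two's complement).
   The caller is expected to check fits_simm13 first. */
uint32_t encode_simm13(int32_t c) {
    return (uint32_t)c & 0x1FFFu;
}

/* Sign-extend a 13-bit field back to a 32-bit signed value. */
int32_t decode_simm13(uint32_t field) {
    field &= 0x1FFFu;
    /* Bit 12 is the sign bit; if it is set, subtract 2^13. */
    return (field & 0x1000u) ? (int32_t)field - 8192 : (int32_t)field;
}
```

A constant outside this range cannot be written directly as the imm operand and has to be built in a register first.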
Sample Instructions
clr reg_rd
    Clears a register to zero
mov reg_or_imm, reg_rd
    Copies content of source to destination
add reg_rs1, reg_or_imm, reg_rd
    Adds oper1 + oper2 and stores the result in the destination
sub reg_rs1, reg_or_imm, reg_rd
    Subtracts oper1 - oper2 and stores the result in the destination
Multiply and Divide
• No instruction available in SPARC
• Use function call instead
• Must use %o0 and %o1 for sources and %o0 holds result
a = b * c               a = b ÷ c
    mov b, %o0              mov b, %o0
    mov c, %o1              mov c, %o1
    call .mul               call .div
Instruction Cycle
• Instruction cycle broken into 4 stages:

Instruction fetch   Fetch & decode instruction, obtain any operands, update PC
Execute             Execute arithmetic instruction, compute branch target address, compute memory address
Memory access       Access memory for load or store instruction; fetch instruction at target of branch instruction
Store results       Write instruction results back to register file
Pipelining
• SPARC is a RISC machine – want to complete one instruction per cycle
• Overlap stages of different instructions to achieve parallel execution
• Can obtain a speedup by a factor of 4
• Hardware does not have to run 4 times faster – break h/w into 4 parts to run concurrently
Pipelining
• Sequential: each h/w stage is idle 75% of the time. For i instructions, time_ex = 4 * i cycles
• Parallel: each h/w stage is working once the pipeline has been filled. For i instructions, time_ex = 3 + i cycles
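The two timing formulas can be compared directly. The sketch below assumes an ideal 4-stage pipeline with no stalls, as on the slide; the function names are made up for illustration, and treating zero instructions as zero cycles is an added assumption that the slide does not address.

```c
/* Cycles for i instructions with no overlap: 4 stages per instruction. */
unsigned long sequential_cycles(unsigned long i) {
    return 4UL * i;
}

/* Cycles with an ideal 4-stage pipeline: 3 cycles to fill it, then one
   instruction completes per cycle. Zero instructions cost zero cycles
   (an assumption; the slide's formula covers i >= 1). */
unsigned long pipelined_cycles(unsigned long i) {
    return (i == 0) ? 0UL : 3UL + i;
}

/* Ratio of sequential to pipelined time; approaches 4 as i grows. */
double pipeline_speedup(unsigned long i) {
    if (i == 0) {
        return 1.0;
    }
    return (double)sequential_cycles(i) / (double)pipelined_cycles(i);
}
```

For 100 instructions this gives 400 cycles against 103, so the factor of 4 is a limit that is approached for long instruction streams, not reached.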
Data Dependencies – Load Delay Problem
        ld [%o0], %o1
        add %o1, %o2, %o2
• The add needs %o1 immediately after the load, before the loaded value has come back from memory
Branch Delay Problem
• Branch target address not available until after execution of branch instruction
• Insert branch delay slot instruction
Branch delays
• Try to place an instruction after the branch that is useful – can also use nop
• The instruction following a branch instruction will always be fetched
• Updating the PC determines which instruction to fetch next
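        cmp %l0, %l1
        bg next
        mov %l2, %l3
        sub %l3, 20, %l4

Condition true: branch to next
Condition false: continue to sub

[Pipeline diagram: cmp, bg, mov, and the instruction after mov (???) each pass through the stages F E M W, each starting one cycle after the previous one. While bg is in execute, mov is being fetched. Stage annotations: fetch instruction from memory[PC]; update PC (PC++); obtain operands; determine if branch taken; update PC with target if true.]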
Actual SPARC Code: expr.m
Expanding Macros
• After running through m4: % m4 expr.m > expr.s
• Produce executable: % gcc expr.s -o expr
• Execute file: % ./expr
The Debugger – gdb
• Used to verify correctness, and find bugs
• Can also execute a program, stop execution at any point and single-step execution
• After assembling the program and placing the output into expr, launch gdb: %gdb expr
• To run code in gdb, type “r”:
(gdb) r
gdb Commands
• Breakpoints can be set at any address to stop execution in order to check the status of the program and registers
• To set a breakpoint at a label:
    (gdb) b main
    Breakpoint 1 at 0x106a8
    (gdb)
• Typing “c” continues execution until it reaches the next breakpoint or end of code
• Can print contents of a register:
    (gdb) p $l1
    $2 = -8
    (gdb)
• Best way to learn is by practice
Filling Delay Slots
• The call instruction is a delayed control transfer instruction: it changes the address from which future instructions will be fetched
• The instruction that follows it is called the delayed instruction, and is located in the delay slot
• The delayed instruction is executed before the branch/call happens
• Filling the delay slot with a nop still wastes a cycle
• Instead, we may be able to move the instruction prior to the branch instruction into the delay slot.
Filling Delay Slots
• Move sub instructions to the delay slots to eliminate nop instructions

        .global main
main:   save %sp, -96, %sp
        mov 9, %l0           ! initialize x
        sub %l0, 1, %o0      ! (x - 1) into %o0
        call .mul
        sub %l0, 7, %o1      ! (x - 7) into %o1
        call .div
        sub %l0, 11, %o1     ! (x - 11) into %o1, the divisor
        mov %o0, %l1         ! store it in y
        ret                  ! end the program
        restore
        ……
.mul:   save …..
        ……

Stepping through the code, one instruction in EXECUTE while the next is in FETCH:
• Executing the mov instruction, while fetching the sub instruction
• Now executing the sub instruction, while fetching the call instruction
• Now executing the call instruction, while fetching the sub instruction. Execution of call will update the PC to fetch from the mul routine, but since sub was already fetched, it will be executed before any instruction from the mul routine
• Now executing the sub instruction, while fetching from the mul routine
• Now executing the save instruction, while fetching the next instruction from the mul routine
• While executing the last instruction of the mul routine, will come back to main and fetch the call .div instruction
• At this point %o0 has the result from the multiply routine – this is the first operand for the divide routine
• The subtract instruction will compute the 2nd operand before starting execution of the divide routine
2.9 Branching
Instructions for testing and branching:
2.9.1 Testing
The information about the state of execution of an instruction is saved in the following flags:

Z   zero       whether the result was zero
N   negative   whether the result was negative
V   overflow   whether the result was too large for the register
C   carry      whether the result generated a carry out

Special add and sub instructions: ‘cc’ is appended to the mnemonic, and the instruction sets condition codes Z, N, V, and C to save the state of execution.
E.g.  addcc reg_rs1, reg_or_imm, reg_rd
      subcc reg_rs1, reg_or_imm, reg_rd
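To make the four flags concrete, here is a small C model of how addcc and subcc would set them for 32-bit operands. It is a sketch, not processor documentation: IccResult, model_addcc, and model_subcc are hypothetical names, and the carry flag for subtraction is modeled as a borrow (set when the first operand is smaller as an unsigned number).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the result and the integer condition codes. */
typedef struct {
    uint32_t result;
    bool n, z, v, c;
} IccResult;

/* Model of addcc: result = a + b, flags set from the 32-bit result. */
IccResult model_addcc(uint32_t a, uint32_t b) {
    IccResult r;
    r.result = a + b;                        /* wraps modulo 2^32 */
    r.n = (r.result >> 31) != 0;             /* sign bit of the result */
    r.z = (r.result == 0);
    /* Overflow: operands have the same sign, result has the other sign. */
    r.v = ((~(a ^ b) & (a ^ r.result)) >> 31) != 0;
    /* Carry out of bit 31: the wrapped sum is smaller than an operand. */
    r.c = r.result < a;
    return r;
}

/* Model of subcc: result = a - b, flags set from the 32-bit result. */
IccResult model_subcc(uint32_t a, uint32_t b) {
    IccResult r;
    r.result = a - b;                        /* wraps modulo 2^32 */
    r.n = (r.result >> 31) != 0;
    r.z = (r.result == 0);
    /* Overflow: operands differ in sign and the result's sign differs from a. */
    r.v = (((a ^ b) & (a ^ r.result)) >> 31) != 0;
    /* Borrow: a is smaller than b when both are read as unsigned. */
    r.c = a < b;
    return r;
}
```

For example, model_addcc(0x7FFFFFFF, 1) sets N and V but not C, while model_addcc(0xFFFFFFFF, 1) sets Z and C but not V.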
2.9.2 Branches
• Branch instructions are similar to call instructions.
• They will specify the label of the destination instruction.
• These too are delayed control transfer instructions.
Branch instructions test the condition codes in order to determine if the branching condition exists:
    b_icc label
where b_icc stands for one of the branches testing the integer condition codes.
Table of signed number branches

Assembler Mnemonic   Unconditional Branches
ba                   Branch always, goto
bn                   Branch never

Assembler Mnemonic   Signed Arithmetic Branches
bl                   Branch on less than zero
ble                  Branch on less or equal to zero
be                   Branch on equal to zero
bne                  Branch on not equal to zero
bge                  Branch on greater or equal to zero
bg                   Branch on greater than zero
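Each mnemonic in the table corresponds to a test on the condition codes. The slides do not spell these tests out; the sketch below uses the standard signed-comparison conditions from the SPARC architecture (for example, "less than" is N xor V). The names Icc and branch_taken are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical snapshot of the integer condition codes. */
typedef struct {
    bool n, z, v, c;
} Icc;

/* Returns true if the named branch would be taken for flags f.
   Unknown mnemonics are treated as not taken. */
bool branch_taken(const char *mnemonic, Icc f) {
    bool less = (f.n != f.v);                 /* N xor V: signed less than */

    if (strcmp(mnemonic, "ba") == 0)  return true;
    if (strcmp(mnemonic, "bn") == 0)  return false;
    if (strcmp(mnemonic, "bl") == 0)  return less;
    if (strcmp(mnemonic, "ble") == 0) return f.z || less;
    if (strcmp(mnemonic, "be") == 0)  return f.z;
    if (strcmp(mnemonic, "bne") == 0) return !f.z;
    if (strcmp(mnemonic, "bge") == 0) return !less;
    if (strcmp(mnemonic, "bg") == 0)  return !(f.z || less);
    return false;
}
```

After a comparison of two equal values (Z set, N and V clear), be, ble, and bge are taken while bl, bg, and bne are not.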
2.10 Control statements
2.10.1 While
The condition of a while loop is evaluated before the loop body is executed; if the condition is not met, the loop, including its first instruction, is not executed. Consider the following while loop in C:
while (a <= 17) {
    a = a + b;
    c++;
}
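One way to see how such a loop maps onto branches is to rewrite it in C with explicit jumps: branch to the test first, then branch back to the top of the body while the condition holds. This is a sketch of the control flow only, not the assembly from the lecture; LoopState and both function names are invented for illustration, and the loop only terminates when b is positive.

```c
/* Hypothetical pair of values produced by the loop. */
typedef struct {
    int a;
    int c;
} LoopState;

/* The loop as written in structured C. */
LoopState while_structured(int a, int b, int c) {
    LoopState s;
    while (a <= 17) {
        a = a + b;
        c++;
    }
    s.a = a;
    s.c = c;
    return s;
}

/* The same loop with the control flow made explicit.
   Terminates only if b > 0 (or a > 17 on entry). */
LoopState while_lowered(int a, int b, int c) {
    LoopState s;
    goto test;                 /* like "ba test": evaluate the condition first */
loop:
    a = a + b;                 /* loop body */
    c++;
test:
    if (a <= 17) goto loop;    /* like "cmp a, 17" followed by "ble loop" */
    s.a = a;
    s.c = c;
    return s;
}
```

Because the first jump goes straight to the test, the body is skipped entirely when a is already greater than 17.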
Annulled Conditional Branches:
- If the condition is true, the branch is taken and the delay slot instruction is executed
- If the condition is false, the branch is not taken and the delay slot instruction is annulled
- The delay slot instruction is fetched in either case; only its execution is annulled, which wastes a cycle when the condition is false
2.10.2 Do
Consider a Do loop:
2.10.3 For
For structure in C:
for ( ex1; ex2; ex3 ) st
Express the above definition as:
ex1;
while ( ex2 ) {
    st
    ex3;
}
Thus the translation of
for (a = 1; a <= b; a++)
    c *= a;
would be:
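As a sketch in C rather than SPARC assembly, applying the for-to-while rule to this loop gives the second function below. Both function names are hypothetical, and starting c at 1 is an assumption made only so that the two versions can be compared.

```c
/* The loop in its original for form. c starts at 1 (an assumption). */
int product_for(int b) {
    int a;
    int c = 1;
    for (a = 1; a <= b; a++)
        c *= a;
    return c;
}

/* The same loop after the rewrite: ex1; while (ex2) { st ex3; } */
int product_while(int b) {
    int a;
    int c = 1;
    a = 1;                /* ex1 */
    while (a <= b) {      /* ex2 */
        c *= a;           /* st  */
        a++;              /* ex3 */
    }
    return c;
}
```

With b = 5 both versions leave c equal to 1 * 2 * 3 * 4 * 5 = 120, and with b = 0 neither body runs.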
2.10.4 If Then
The statement following the relational expression is to be branched over if the condition is not true. To accomplish this, we logically complement the sense of the branch: after evaluating the relational expression, and before the code for the statement, we branch on the opposite condition.
Table of complements of the branches
Condition Complement
bl bge
ble bg
be bne
bne be
bge bl
bg ble
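The complement table is a fixed lookup, which can be written down directly. In this sketch, complement_branch is a hypothetical helper name; it returns NULL for mnemonics that are not in the table.

```c
#include <stddef.h>
#include <string.h>

/* Look up the complement of a signed branch mnemonic, following the table
   above. Returns NULL if the mnemonic is not one of the six listed. */
const char *complement_branch(const char *mnemonic) {
    static const char *const pairs[][2] = {
        { "bl",  "bge" },
        { "ble", "bg"  },
        { "be",  "bne" },
        { "bne", "be"  },
        { "bge", "bl"  },
        { "bg",  "ble" }
    };
    size_t count = sizeof pairs / sizeof pairs[0];
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(mnemonic, pairs[i][0]) == 0) {
            return pairs[i][1];
        }
    }
    return NULL;
}
```

So a C test such as a > b, which on its own would suggest bg, is coded with ble so that the branch skips the statement when the condition fails. Complementing twice gives back the original mnemonic.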
For example, to translate
2.10.5 If Else
An if-else statement allows us to do a little better with regard to filling the delay slot.
Consider:
if ((a + b) >= c) {
    a += b;
    c++;
} else {
    a -= b;
    c--;
}
c += 10;
We complement the initial test so that, if the condition is false, we branch over the then code to the else code.
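The control flow just described can be sketched in C with explicit jumps. This shows the shape of the translation only, not the assembly from the lecture; IfElseState, if_else_lowered, and the labels are invented names.

```c
/* Hypothetical pair of values produced by the statement. */
typedef struct {
    int a;
    int c;
} IfElseState;

/* The if-else above with the complemented test and explicit jumps. */
IfElseState if_else_lowered(int a, int b, int c) {
    IfElseState s;

    if ((a + b) < c) goto else_part;   /* complement of >=, like "bl else_part" */
    a += b;                            /* then code */
    c++;
    goto next;                         /* like "ba next": skip the else code */
else_part:
    a -= b;                            /* else code */
    c--;
next:
    c += 10;                           /* statement after the if-else */
    s.a = a;
    s.c = c;
    return s;
}
```

With a = 5, b = 3, c = 2 the then code runs and the result is a = 8, c = 13; with a = 1, b = 1, c = 5 the else code runs and the result is a = 0, c = 14.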