1 CSC 3210 Computer Organization and Programming Chapter 7 SUBROUTINES D.M. Rasanjalee Himali


CSC 3210 Computer Organization and Programming

Chapter 7

SUBROUTINES

D.M. Rasanjalee Himali

Outline

Introduction
Open Subroutines
Register Saving
Subroutine Linkage
Arguments to Subroutines
Examples
Leaf Subroutines
Pointers as Arguments to Subroutines

Introduction

In programming there is frequently a need either to repeat a computation or to repeat the computation with different arguments.

It is possible to repeat a computation by means of a subroutine.

Subroutines may be either open or closed.

Introduction

An open subroutine is handled by the text editor or by the macro preprocessor: the required code is inserted wherever it is needed in the program.

A closed subroutine is one in which the code appears only once in the program; whenever it is needed, a jump to the code is executed, and when it completes, a return is made to the instruction following the jump instruction.

Arguments to closed subroutines may be placed in registers or on the stack.

Introduction

Execution of the subroutine should not change the state of the machine, except possibly for the condition codes.

That is, any registers that the subroutine uses must first be saved and then restored after the subroutine completes execution.

Arguments to subroutines are normally local variables of the subroutine, and generally, the subroutine is free to change them.

Register Saving

Almost any computation will involve the use of registers.

The SPARC architecture provides for a register file with a mapping register that indicates the active registers.

Typically, 128 registers are provided, with the programmer having access to the eight global registers, and only 24 of the mapped registers at any one time.

The save instruction changes the register mapping so that new registers are provided.

A similar instruction, restore, restores the register mapping on subroutine return.

[Figure: register file and stack for two nested subroutines. S1 executes save %sp, -96, %sp, calls S2, and ends with restore; S2 executes save %sp, -96, %sp and ends with restore.]

Effect of save:

1. Reserve 24 new registers (8 in | 8 local | 8 out), plus the 8 global registers common to all subroutines.

2. Reserve stack memory (96 bytes in this case).

Effect of restore:

1. Restore/release the reserved registers.

2. Release the stack memory (96 bytes in this case).

Register file: 8 global registers, plus register sets of 16 registers each (8 * 16 = 128 registers).

[Figure sequence: memory and register file (8 global registers, 8 * 16 = 128 windowed registers) while S1, which begins with save %sp, -96, %sp, calls S2, which begins with save %sp, -64, %sp. Before execution, only %sp is shown. After S1's save, a 96-byte frame lies between %fp and %sp and the CWP selects S1's register set. After S2's save, a further 64-byte frame is added below it and the CWP, %fp and %sp move accordingly. After S2's restore, only the 96-byte frame remains. After S1's restore, the stack is back to its initial state.]

Register Saving

The 32 registers are divided into four groups: in, local, out, and global.

The eight global registers, %g0-%g7, are NOT mapped and are shared by all subroutines.

The in registers are used to receive the arguments passed to closed subroutines.

The local registers are for a subroutine’s local variables.

The out registers are used to pass arguments to subroutines that are called by the current subroutine.

The in, local, and out registers are mapped.

Register Saving

When the save instruction is executed, the out registers become the in registers, and a new set of local and out registers is provided.

The mapping pointer into the register file is changed by 16 registers.

[Figure: two diagrams of the register file, each showing the 8 global registers.]

Register Saving

The current register set is indicated by the current window pointer, “CWP,” a machine register.

The last free register set is marked by the window invalid bit, in the “WIM,” another machine register.

Each register set contains 16 general-purpose registers; the number of register sets is implementation dependent.

In the examples here there are really 8 x 16 hardware registers, and the set selected is controlled by the cwp.

When the save instruction is executed, the prior subroutine’s register contents remain unchanged until a restore instruction is executed, resetting the cwp.


Register Saving

If a further five subroutine calls are made without any returns, the situation in Figure 7.3 exists.

The out registers being used are from the invalid register window marked by the wim bit.

[Figure panels: after a further 5 subroutine calls without return; one additional subroutine call; after a further 6 subroutine calls without return (hardware trap).]

Register Saving

If more than five additional subroutine calls are made, a hardware trap occurs.

Its effect is to move the 16 registers from window set seven onto the stack where the stack pointer of register window seven is pointing.

The trap handler may use the local registers of the invalid window.

The cwp and wim pointers are moved as shown in Figure 7.2.
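The overflow behavior described above can be illustrated with a small C model. It is only a sketch under stated assumptions: eight register windows, the wim represented as a single window index rather than a bit mask, and the names WindowState and do_save invented for illustration. The one architectural detail it relies on is that save moves the cwp to the next lower-numbered window, modulo the number of windows.

```c
#include <assert.h>

#define NWINDOWS 8  /* assumption: eight register sets, as in the slides */

/* Hypothetical model state: not a real SPARC data structure. */
typedef struct {
    int cwp;     /* index of the current window */
    int wim;     /* index of the window marked invalid */
    int spills;  /* number of overflow traps taken so far */
} WindowState;

/* Model of a save: step to the next window, trapping on overflow. */
static void do_save(WindowState *s) {
    assert(s->cwp >= 0 && s->cwp < NWINDOWS);
    int next = (s->cwp + NWINDOWS - 1) % NWINDOWS;
    if (next == s->wim) {
        /* Overflow trap: the oldest window is written to the stack
           and becomes the new invalid window. */
        s->wim = (s->wim + NWINDOWS - 1) % NWINDOWS;
        s->spills++;
    }
    s->cwp = next;
}
```

Starting with window 7 in use and window 0 marked invalid, six calls to do_save succeed without a trap; the seventh takes the trap, spills window 7, and marks it invalid.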

Register Saving

Register window mapping explains the process by which the stack pointer becomes the frame pointer.

The stack pointer is register %o6, which, after a save, becomes %i6, the frame pointer.

Register Saving

The save and restore instructions are both also add instructions. However, the source registers are always from the current register set, and the destination register is always in the new register set.

Thus the following instruction subtracts 64 from the current stack pointer but stores the result into the new stack pointer, leaving the old stack pointer contents unchanged:

save %sp, -64, %sp

Here 64 = (4 bytes per register * 16 registers per register set).

After the save instruction is executed, the old, unchanged stack pointer becomes the new frame pointer.
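The way save reads its source in the old window and writes its destination in the new one can be mimicked in C. This is a minimal sketch, assuming 32-bit addresses; the Frame type and the model_save function are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of pointers as seen from one register window. */
typedef struct {
    uint32_t sp;  /* %sp, that is %o6 */
    uint32_t fp;  /* %fp, that is %i6 */
} Frame;

/* Model of "save %sp, imm, %sp": the add uses the caller's %sp,
   and the result lands in the callee's %sp. */
static Frame model_save(Frame caller, int32_t imm) {
    assert(imm < 0 && imm % 8 == 0);         /* frame sizes stay double-word aligned */
    Frame callee;
    callee.fp = caller.sp;                   /* caller's %o6 is the callee's %i6 */
    callee.sp = caller.sp + (uint32_t)imm;   /* old %sp plus a negative immediate */
    return callee;
}
```

Because caller is passed by value, the caller's own stack pointer is visibly untouched after the call, which mirrors the statement that the old stack pointer contents are left unchanged.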

Register Saving

The restore instruction restores the register window set.

In doing this, a register window can underflow if the cwp is moved to the window marked by the wim.

When this happens, the window trap routine restores the registers from the stack and resets the pointers.

The restore instruction is also an add instruction and is frequently used as the final add instruction in a subroutine.

Subroutine Linkage

To branch to the first instruction of a subroutine, a ba instruction might be used.

Unfortunately, if it is used there is no way of returning to the point where the subroutine was called.

The SPARC architecture supports two instructions for linking to subroutines: jmpl and call.

Both instructions may be used to store the address of the instruction that called the subroutine into register %o7.

Question: What is the return address of the subroutine with no save instruction executed at the beginning?

Question: What is the return address of the subroutine with a save instruction executed at the beginning?

Subroutine Linkage

As the instruction following the instruction that called the subroutine (the delay slot instruction) will also be executed, the return from a subroutine is to %o7 + 8, which is the address of the next instruction to be executed in the main program.

If a save instruction is executed at the beginning of the subroutine, the contents of %o7 will become the contents of %i7 and the return will have to be to %i7 + 8.
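The "+ 8" follows from every SPARC instruction being four bytes long: the calling instruction sits at the address held in %o7, its delay slot instruction at the next word, and execution resumes one word after that. A minimal C sketch of this arithmetic, with function names invented for illustration:

```c
#include <assert.h>
#include <stdint.h>

enum { INSN_BYTES = 4 };  /* every SPARC instruction is one 32-bit word */

/* Address of the delay slot instruction that follows the call. */
static uint32_t delay_slot_address(uint32_t o7) {
    assert(o7 % INSN_BYTES == 0);  /* instructions are word aligned */
    return o7 + INSN_BYTES;
}

/* Address at which the caller resumes: past the call and its delay slot. */
static uint32_t return_address(uint32_t o7) {
    assert(o7 % INSN_BYTES == 0);
    return o7 + 2 * INSN_BYTES;
}
```

For a call located at address 0x2000, the delay slot is at 0x2004 and the subroutine returns to 0x2008.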

Subroutine Linkage

If the subroutine name is known at assembly time, the call instruction may be used to link to a subroutine.

The call instruction has as its operand the label at the entry to the subroutine and transfers control to that address.

It also stores the current value of the program counter, %pc, into %o7.

Like any instruction that changes the %pc, the call instruction is always followed by a delay slot instruction. The delay slot instruction of a call may not be annulled.

Subroutine Linkage

If the address of the subroutine is computed, it must be loaded into a register.

If this is done, the jmpl instruction is used to call the subroutine.

Like most other instructions, the jmpl instruction has two source arguments and a destination register.

The source may be a register and a constant or two registers.

The address of the subroutine is the sum of the register contents or the sum of the register and the constant.

It is this address to which the transfer takes place.

Subroutine Linkage

Like all branching instructions, jmpl is followed by a delay slot instruction.

The address of the jmpl instruction is stored in the destination register.

Thus, to call a subroutine whose address is in register %o0, storing the return address into %o7, we would write:

jmpl %o0, %o7

Here %o0 holds the destination address (the called subroutine) and %o7 receives the return address (in the calling subroutine).

The assembler recognizes

call %o0

as

jmpl %o0, %o7

You may use call for both types of subroutine calls.

Subroutine Linkage

The return from a subroutine also makes use of the jmpl instruction.

In this case we need to return to %i7 + 8, and the assembler recognizes the mnemonic ret for:

jmpl %i7 + 8, %g0

The call to a subroutine is then:

At the entry of the subroutine:

with the return:

The save instruction in the called subroutine causes %o6 in the calling subroutine to map to %i6 in the called subroutine.

Subroutine Linkage

The ret instruction is expanded by the assembler to

jmpl %i7 + 8, %g0

The restore instruction:

is normally used to fill the delay slot of the ret instruction
restores the register window set
can be the final add instruction in a subroutine


Arguments to Subroutines

Arguments to subroutines can follow in-line after the call instruction, be on the stack, or be located in registers.

Arguments to Subroutines

1. Arguments follow in-line after the call instruction:

For example, a Fortran routine to add two numbers, 3 and 4, together would be called by:

and handled by the following subroutine code:

Note that the return is to %i7 + 16, jumping over the two argument words that follow the call and its delay slot.

This type of argument passing is very efficient but is limited. Recursive calls are not possible, nor is it possible to compute any of the arguments.

Arguments to Subroutines

2. Arguments placed on the stack:

Placing arguments onto the stack is very general but time consuming. Each argument must be stored on the stack before the subroutine may be called. It allows us complete flexibility to compute arguments, pass any number of arguments, and support recursive calls.

Arguments to Subroutines

3. Arguments placed in registers:

We can use registers %o0 to %o5 (six registers) to pass six values to the new subroutine (where they will appear in registers %i0 to %i5).

If there are more arguments than that, the remaining ones have to be stored on the stack. Hence the save instruction at the start of the function will have to be modified accordingly.

After execution of a save instruction the arguments will be in the first six in registers, %i0-%i5.

Arguments to Subroutines

The convention established in the SPARC architecture is to pass the first six arguments in the first six out registers, %o0-%o5, with any additional arguments placed on the stack.

However, space is always reserved for the first six arguments on the stack even though they are not there (similar to reserved space for register saving).

In fact, the space is reserved even if there are no arguments at all.

Each argument occupies ONE WORD on the stack or in a register, so byte arguments must be widened to word quantities before they are passed to a subroutine.

Arguments to Subroutines

The arguments are located on the stack, after the 64 bytes reserved for register window saving.

However, immediately after the 64 bytes reserved for register window saving, there is a pointer to where a structure may be returned (this is discussed in Section 7.7).

Thus, the structure return pointer will be at %sp + 64 and the first argument, if it were on the stack, at %sp + 68.

Arguments to Subroutines

As we have seen in the previous examples, 64 bytes are reserved on the stack for register window saving.

A further 4 bytes are now needed for a pointer to an address where a structure may be returned by the function.

After that, 24 bytes are reserved by convention for the first six arguments.

After that, more space can be reserved for local variables on the stack. The typical save instruction will now have to be modified so that it reserves 64 + 4 + 24 = 92 bytes plus the space needed for local variables.
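The frame-size arithmetic can be checked with a few lines of C. The sketch below mirrors the expression form used in this chapter's example listing, (-92 + -264) & -8, where masking with -8 rounds the negative size down to a multiple of eight so that %sp stays double-word aligned. The function name save_immediate and the constant names are assumptions made for illustration, and the masking relies on two's-complement integers.

```c
#include <assert.h>

enum {
    WINDOW_SAVE_BYTES = 64,  /* 16 registers * 4 bytes */
    STRUCT_PTR_BYTES  = 4,   /* structure return pointer */
    ARG_SAVE_BYTES    = 24   /* six argument words */
};

/* Immediate operand for "save %sp, imm, %sp" given the bytes of locals. */
static int save_immediate(int local_bytes) {
    assert(local_bytes >= 0);
    int needed = WINDOW_SAVE_BYTES + STRUCT_PTR_BYTES + ARG_SAVE_BYTES + local_bytes;
    return (-needed) & -8;   /* round toward more negative, to a multiple of 8 */
}
```

With no locals the 92 bytes round to a 96-byte frame, which is why save %sp, -96, %sp appears so often; 264 bytes of locals give -360, and 12 bytes of locals give -104.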

Arguments to Subroutines

The save instruction provides:

Space for saving the register window set, if necessary
A structure pointer
A place to save six arguments
Space for any local variables

while keeping the %sp aligned on a double-word boundary.

Arguments to Subroutines

If we had a subroutine vector with local variables:

then the save instruction would be:

resulting in 104 bytes being subtracted from the stack pointer.

Arguments to Subroutines

The structure pointer and the space to save the called routine’s arguments are all accessed at positive offsets from the stack pointer.

The subroutine’s own arguments are located at positive offsets from the frame pointer.

Local variables are accessed at negative offsets from the frame pointer.

Arguments to Subroutines

The argument offsets are logically defined as:

define(arg1_s, 68)
define(arg2_s, 72)
define(arg3_s, 76)
define(arg4_s, 80)
define(arg5_s, 84)
define(arg6_s, 88)

Notice the positive offsets! (w.r.t. %sp)
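These offsets follow a simple rule: skip the 64 bytes of register window save area and the 4-byte structure pointer, then step one word per argument. A minimal C sketch of that rule, with the function name arg_offset chosen here for illustration:

```c
#include <assert.h>

enum {
    WINDOW_SAVE_BYTES = 64,  /* register window save area */
    STRUCT_PTR_BYTES  = 4,   /* structure return pointer at offset 64 */
    WORD_BYTES        = 4    /* each argument occupies one word */
};

/* Stack offset of argument n, counting from 1. */
static int arg_offset(int n) {
    assert(n >= 1);
    return WINDOW_SAVE_BYTES + STRUCT_PTR_BYTES + (n - 1) * WORD_BYTES;
}
```

The same rule gives the positions of arguments beyond the sixth: argument seven lands at offset 92 and argument eight at offset 96.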

Example – Called Subroutine

Let us look at an example. We will express the algorithm in C as follows:

! incoming arguments
define(a_r, i0)
define(b_r, i1)
define(c_r, i2)

! automatic variables (offsets from %fp)
define(x_s, -4)
define(y_s, -8)
define(ary_s, -264)

! register variables
define(i_r, l0)
define(j_r, l1)

        .global example
example:
        save    %sp, (-92 + -264) & -8, %sp
        add     %a_r, %b_r, %o0         ! x = a + b
        st      %o0, [%fp + x_s]
        add     %c_r, 64, %i_r          ! i = c + 64
        add     %a_r, %c_r, %o0         ! ary[i] = c + a
        sll     %i_r, 1, %o1            ! byte offset = i * 2 (halfword elements)
        add     %fp, ary_s, %o2         ! address of ary[0]
        sth     %o0, [%o1 + %o2]
        ld      [%fp + x_s], %o0        ! y = x * a
        call    .mul
        mov     %a_r, %o1               ! delay slot: second operand of .mul
        st      %o0, [%fp + y_s]
        ld      [%fp + x_s], %o0        ! j = x + i
        add     %i_r, %o0, %j_r
        ld      [%fp + x_s], %o0        ! return x + y
        ld      [%fp + y_s], %o1
end_example:
        ret
        restore %o0, %o1, %o0           ! result in the caller's %o0

……
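A C function consistent with the comments in the assembly listing above might look as follows. It is a reconstruction, not the original source: the array length of 128 shorts is inferred from the 256 bytes between ary_s and y_s together with the halfword store, and the bounds check on i is an addition made here so that the sketch is safe to run.

```c
#include <assert.h>

enum { ARY_LEN = 128 };  /* assumption: 256 bytes of 2-byte elements */

int example(int a, int b, int c) {
    int x, y;            /* automatic variables, kept on the stack */
    short ary[ARY_LEN];
    register int i, j;   /* register variables, kept in %l0 and %l1 */

    x = a + b;
    i = c + 64;
    assert(i >= 0 && i < ARY_LEN);  /* added guard, not in the listing */
    ary[i] = (short)(c + a);
    y = x * a;
    j = x + i;
    (void)ary[i];        /* silence unused-value warnings in this sketch */
    (void)j;
    return x + y;
}
```

For instance, example(1, 2, 3) computes x = 3 and y = 3 and returns 6.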

Return Values

Subroutines that return a value are called functions.

A function in C and C++ can also return a structure.

The value returned by a function or subroutine is always returned in register %o0 of the calling program.

If a save instruction has been executed in the called function, the caller's %o0 is the called function's %i0 until the restore instruction is executed.

Subroutines with Many Arguments

Arguments beyond the sixth are passed on the stack.

In this case we must first make room for the arguments by subtracting from the stack pointer.

For example, to call a subroutine with eight arguments:

foo(1, 2, 3, 4, 5, 6, 7, 8)

which returns the sum:

return a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8

We first have to make room for arguments seven and eight, which will go on the stack, making sure that the stack is still double-word aligned.

Subroutines with Many Arguments

Calling subroutine:

Calling sub() {
    ----
    foo(1, 2, 3, 4, 5, 6, 7, 8)     (two additional arguments)
    ----
}

The seventh and eighth arguments will go onto the stack at %sp + 92 and at %sp + 96, respectively. We can then pass the arguments as follows:

        ! Make space for two additional args on stack for foo
        add     %sp, -2*4 & -8, %sp

        ! Load additional args to stack
        mov     7, %o0                  ! load arg 7 with its value
        st      %o0, [%sp + 92]
        mov     8, %o0                  ! load arg 8 with its value
        st      %o0, [%sp + 96]

        ! Load first 6 args going to in registers
        mov     6, %o5
        mov     5, %o4
        mov     4, %o3
        mov     3, %o2
        mov     2, %o1
        mov     1, %o0

        ! Call foo subroutine
        call    foo
        nop

        ! Release space on stack reserved for additional args
        sub     %sp, -2*4 & -8, %sp

Notice the positive offsets of the additional arguments w.r.t. %sp.

Subroutines with Many Arguments

Called subroutine:

foo(int: a1, a2, a3, a4, a5, a6, a7, a8) {
    return a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8
}

Inside foo the arguments may be accessed by:

! define incoming argument offsets
define(a1_r, i0)
define(a2_r, i1)
define(a3_r, i2)
define(a4_r, i3)
define(a5_r, i4)
define(a6_r, i5)
define(a7_s, 92)
define(a8_s, 96)

        .global foo
foo:
        save    %sp, -96, %sp
        ld      [%fp + a8_s], %o0       ! 8th argument
        ld      [%fp + a7_s], %o1       ! 7th argument
        add     %o0, %o1, %o0
        add     %a6_r, %o0, %o0         ! 6th argument
        add     %a5_r, %o0, %o0         ! 5th argument
        add     %a4_r, %o0, %o0         ! 4th argument
        add     %a3_r, %o0, %o0         ! 3rd argument
        add     %a2_r, %o0, %o0         ! 2nd argument
        ret
        restore %a1_r, %o0, %o0         ! 1st argument

Notice the positive offsets of the additional arguments w.r.t. %fp.
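The reason the caller writes arguments seven and eight relative to %sp while foo reads them relative to %fp is that foo's save turns the caller's stack pointer into foo's frame pointer. The C model below illustrates this with a byte array standing in for stack memory. It is a sketch under assumptions: the array size, the starting stack address of 512, and the names stack_mem, store_word, load_word, foo_model and caller_model are all invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 1024 };
static uint8_t stack_mem[STACK_BYTES];  /* hypothetical stack memory */

static void store_word(uint32_t addr, int32_t v) {
    assert(addr + sizeof v <= STACK_BYTES);
    memcpy(&stack_mem[addr], &v, sizeof v);
}

static int32_t load_word(uint32_t addr) {
    int32_t v;
    assert(addr + sizeof v <= STACK_BYTES);
    memcpy(&v, &stack_mem[addr], sizeof v);
    return v;
}

/* foo after its save: in registers hold args 1-6, fp is the caller's sp. */
static int32_t foo_model(const int32_t in_regs[6], uint32_t fp) {
    int32_t sum = load_word(fp + 96) + load_word(fp + 92);  /* args 8 and 7 */
    for (int k = 5; k >= 0; k--) {
        sum += in_regs[k];
    }
    return sum;
}

static int32_t caller_model(void) {
    uint32_t sp = 512;                         /* assumed, double-word aligned */
    sp += (uint32_t)((-2 * 4) & -8);           /* add %sp, -2*4 & -8, %sp */
    store_word(sp + 92, 7);                    /* st %o0, [%sp + 92] */
    store_word(sp + 96, 8);                    /* st %o0, [%sp + 96] */
    int32_t out_regs[6] = {1, 2, 3, 4, 5, 6};  /* %o0-%o5 */
    int32_t result = foo_model(out_regs, sp);  /* outs become ins, sp becomes fp */
    sp -= (uint32_t)((-2 * 4) & -8);           /* sub %sp, -2*4 & -8, %sp */
    assert(sp == 512);                         /* stack space released */
    return result;
}
```

Calling caller_model() yields 36, the sum of the integers 1 through 8.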

[Figure: memory and register file before calling foo in the calling subroutine (but after placing the arguments in registers and memory). Arguments 1 to 6 are in the calling subroutine's out registers; arguments 7 and 8 are in memory at %sp+92 and %sp+96, above the 92 bytes at the start of the calling subroutine's frame.]

[Figure: memory and register file after calling foo (but before foo returns). The calling subroutine's out registers are now foo's in registers, holding arguments 1 to 6; arguments 7 and 8 are at %fp+92 and %fp+96; foo has its own local and out registers and its own 92-byte area at the new %sp.]

Leaf Subroutines

A leaf routine is one that does not call any other routines.

For a leaf routine the register usage is restricted as follows: the leaf routine may only use the first six out registers and the global registers %g0 and %g1.

A leaf routine does not execute either a save or a restore instruction but simply uses the calling subroutine’s register set, observing the restrictions listed above.

The elimination of register saving and restoring makes calling a leaf routine very efficient.

The .mul routine is a leaf routine.

Leaf Subroutines

A leaf routine is called in the same manner as a regular subroutine, placing the return address into %o7.

As a save instruction is not executed, the return address for a leaf routine is %o7 + 8, not %i7 + 8.

To return from a leaf subroutine, we use the retl instruction.

The assembler recognizes retl for:

jmpl %o7 + 8, %g0

Leaf Subroutines

The subroutine foo should have been written as a leaf routine as follows:

! define incoming argument offsets
define(a1_r, o0)
define(a2_r, o1)
define(a3_r, o2)
define(a4_r, o3)
define(a5_r, o4)
define(a6_r, o5)
define(a7_s, 92)
define(a8_s, 96)

        .global foo
foo:
        add     %a2_r, %a1_r, %o0       ! o0 = 1st + 2nd
        add     %a3_r, %o0, %o0         ! o0 += 3rd
        add     %a4_r, %o0, %o0         ! o0 += 4th
        add     %a5_r, %o0, %o0         ! o0 += 5th
        add     %a6_r, %o0, %o0         ! o0 += 6th
        ld      [%sp + a7_s], %o1
        add     %o1, %o0, %o0           ! o0 += 7th
        ld      [%sp + a8_s], %o1
        add     %o1, %o0, %o0           ! o0 += 8th
end_foo:
        retl
        nop