Computer Architecture and Design – ECEN 350
Part 4
[Some slides adapted from M. Irwin, D. Patterson and others]
Compiler
• Transforms the C program into an assembly language program
• Advantages of high-level languages:
  - many fewer lines of code
  - easier to understand and debug
• Today’s optimizing compilers can produce assembly code nearly as good as that of an assembly language programming expert, and often better for large programs:
  - smaller code size, faster execution
Assembler
• Transforms symbolic assembler code into object (machine) code
• Advantages of assembler:
  - much easier than remembering the instructions’ binary codes
  - can use labels for addresses and let the assembler do the arithmetic
  - can use pseudo-instructions, e.g., “move $t0, $t1” exists only in assembler (would be implemented using “add $t0,$t1,$zero”)
• When considering performance, you should count instructions executed, not code size
The Two Main Tasks of the Assembler
• Finds the memory locations with labels so the relationship between the symbolic names and their addresses is known
  - Symbol table: holds labels and their corresponding addresses
  - A label is local if the object is used only within the file where it is defined. Labels are local by default.
  - A label is external (global) if it refers to code or data in another file or if it is referenced from another file. Global labels must be explicitly declared global (e.g., .globl main)
• Translates each assembly language statement by combining the numeric equivalent of the opcodes, register specifiers, and labels
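As a rough illustration of the first task, the sketch below makes one pass over a list of source lines and records each label together with the address of the instruction that follows it. It is a minimal sketch rather than how any particular assembler is written: the names `Symbol`, `SymbolTable`, `build_symbol_table` and `lookup_symbol` are made up for this example, and it assumes pseudo-instructions have already been expanded so that every non-label line is exactly one 4-byte instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64

/* One symbol table entry: a label and the address it marks. */
typedef struct {
    char name[32];
    unsigned address;
} Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    size_t count;
} SymbolTable;

/*
 * First pass: record every "label:" line at the current address and
 * advance the address by 4 for every other line.
 * Assumption: each non-label line is one 4-byte machine instruction.
 */
void build_symbol_table(const char *const lines[], size_t n,
                        SymbolTable *table)
{
    unsigned address = 0;   /* each file is assembled starting at 0 */
    table->count = 0;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(lines[i]);
        if (len > 0 && lines[i][len - 1] == ':') {
            assert(table->count < MAX_SYMBOLS);
            assert(len - 1 < sizeof table->entries[0].name);
            Symbol *s = &table->entries[table->count++];
            memcpy(s->name, lines[i], len - 1);   /* copy without ':' */
            s->name[len - 1] = '\0';
            s->address = address;
        } else {
            address += 4;
        }
    }
}

/* Look up a label: returns 1 and stores its address, or 0 if absent. */
int lookup_symbol(const SymbolTable *table, const char *name,
                  unsigned *address)
{
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i].name, name) == 0) {
            *address = table->entries[i].address;
            return 1;
        }
    }
    return 0;
}
```

Given `main:` followed by six instructions and then `loop:`, this records `main` at 0x00 and `loop` at 0x18. A label that is looked up but not found, such as `printf`, is exactly the kind of reference that has to be left for the linker.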
Example: C → Asm → Obj → Exe → Run
#include <stdio.h>
int main (int argc, char *argv[]) {
    int i;
    int sum = 0;
    for (i = 0; i <= 100; i = i + 1)
        sum = sum + i * i;
    printf ("The sum from 0 .. 100 is %d\n", sum);
}
Example: C → Asm → Obj → Exe → Run
        .text
        .align 2
        .globl main
main:
        subu  $sp,$sp,40
        sw    $ra, 20($sp)
        sd    $a0, 32($sp)
        sw    $0, 24($sp)
        sw    $0, 28($sp)
loop:
        lw    $t6, 28($sp)
        mul   $t7, $t6,$t6
        lw    $t8, 24($sp)
        addu  $t9,$t8,$t7
        sw    $t9, 24($sp)
        addu  $t0, $t6, 1
        sw    $t0, 28($sp)
        ble   $t0,100, loop
        la    $a0, str
        lw    $a1, 24($sp)
        jal   printf
        move  $v0, $0
        lw    $ra, 20($sp)
        addiu $sp,$sp,40
        j     $ra
        .data
        .align 0
str:    .asciiz "The sum from 0 .. 100 is %d\n"
Example: C → Asm → Obj → Exe → Run
00 addiu $29,$29,-40
04 sw    $31,20($29)
08 sw    $4, 32($29)
0c sw    $5, 36($29)
10 sw    $0, 24($29)
14 sw    $0, 28($29)
18 lw    $14, 28($29)
1c multu $14, $14
20 mflo  $15
24 lw    $24, 24($29)
28 addu  $25,$24,$15
2c sw    $25, 24($29)
30 addiu $8,$14, 1
34 sw    $8,28($29)
38 slti  $1,$8, 101
3c bne   $1,$0, loop
40 lui   $4, l.str
44 ori   $4,$4,r.str
48 lw    $5,24($29)
4c jal   printf
50 add   $2, $0, $0
54 lw    $31,20($29)
58 addiu $29,$29,40
5c jr    $31
• Remove pseudoinstructions, assign addresses
Symbol Table Entries
Symbol Table
Label      Address
main:      0x00000000
loop:      0x00000018
str:       0x10000430
printf:    0x000003b0

Relocation Information
Address       Instr. Type    Dependency
0x00000040    lui            l.str
0x00000044    ori            r.str
0x0000004c    jal            printf
Example: C → Asm → Obj → Exe → Run
00 addiu $29,$29,-40
04 sw    $31,20($29)
08 sw    $4, 32($29)
0c sw    $5, 36($29)
10 sw    $0, 24($29)
14 sw    $0, 28($29)
18 lw    $14, 28($29)
1c multu $14, $14
20 mflo  $15
24 lw    $24, 24($29)
28 addu  $25,$24,$15
2c sw    $25, 24($29)
30 addiu $8,$14, 1
34 sw    $8,28($29)
38 slti  $1,$8, 101
3c bne   $1,$0, -10
40 lui   $4, 4096
44 ori   $4,$4,1072
48 lw    $5,24($29)
4c jal   812
50 add   $2, $0, $0
54 lw    $31,20($29)
58 addiu $29,$29,40
5c jr    $31
• Edit Addresses: start at 0x00400000
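The numbers that replace `loop`, `l.str` and `r.str` in the listing above can be recomputed by hand. The sketch below shows that arithmetic; the function names are invented for this example. It assumes the standard MIPS rule that a branch offset is counted in instructions from the address after the branch (PC + 4), and that `lui`/`ori` simply take the upper and lower 16 bits of the address.

```c
#include <assert.h>
#include <stdint.h>

/* Word offset stored in a MIPS branch: counted from the instruction
 * after the branch (PC + 4), in units of 4-byte instructions. */
int32_t branch_offset(uint32_t branch_addr, uint32_t target_addr)
{
    assert(branch_addr % 4 == 0 && target_addr % 4 == 0);
    return ((int32_t)target_addr - (int32_t)(branch_addr + 4)) / 4;
}

/* Upper and lower 16 bits of a 32-bit address, as used by lui and ori. */
uint32_t upper_half(uint32_t address) { return address >> 16; }
uint32_t lower_half(uint32_t address) { return address & 0xFFFFu; }
```

For the `bne` at 0x3c branching to `loop` at 0x18, `branch_offset(0x3c, 0x18)` is (0x18 - 0x40) / 4 = -10. For `str` at 0x10000430, `upper_half` gives 0x1000 = 4096 and `lower_half` gives 0x0430 = 1072, matching the `lui` and `ori` operands.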
Example: C → Asm → Obj → Exe → Run
0x004000 00100111101111011111111111100000
0x004004 10101111101111110000000000010100
0x004008 10101111101001000000000000100000
0x00400c 10101111101001010000000000100100
0x004010 10101111101000000000000000011000
0x004014 10101111101000000000000000011100
0x004018 10001111101011100000000000011100
0x00401c 10001111101110000000000000011000
0x004020 00000001110011100000000000011001
0x004024 00100101110010000000000000000001
0x004028 00101001000000010000000001100101
0x00402c 10101111101010000000000000011100
0x004030 00000000000000000111100000010010
0x004034 00000011000011111100100000100001
0x004038 00010100001000001111111111110111
0x00403c 10101111101110010000000000011000
0x004040 00111100000001000001000000000000
0x004044 10001111101001010000000000011000
0x004048 00001100000100000000000011101100
0x00404c 00100100100001000000010000110000
0x004050 10001111101111110000000000010100
0x004054 00100111101111010000000000100000
0x004058 00000011111000000000000000001000
0x00405c 00000000000000000001000000100001
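Each 32-bit word above is just the instruction's fields packed side by side. As a small sketch of that packing for the I-format (6-bit opcode, two 5-bit register numbers, 16-bit immediate), the helper below builds one word; `encode_i_format` is a name chosen for this example, and the opcode values used with it (43 for `sw`, 35 for `lw`) are the standard MIPS ones.

```c
#include <assert.h>
#include <stdint.h>

/* Pack an I-format instruction: opcode(6) | rs(5) | rt(5) | immediate(16). */
uint32_t encode_i_format(uint32_t opcode, uint32_t rs, uint32_t rt,
                         int32_t immediate)
{
    assert(opcode < 64 && rs < 32 && rt < 32);
    assert(immediate >= -32768 && immediate <= 65535);
    return (opcode << 26) | (rs << 21) | (rt << 16)
         | ((uint32_t)immediate & 0xFFFFu);   /* keep the low 16 bits */
}
```

For example, `sw $31,20($29)` is `encode_i_format(43, 29, 31, 20)`, which yields 0xAFBF0014, the bit pattern 10101111101111110000000000010100 shown at 0x004004.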
Other Tasks of the Assembler
• Converts pseudo-instructions to legal assembly code
  - register $at is reserved for the assembler to do this
• Converts branches to far away locations into a branch followed by a jump
• Converts instructions with large immediates into a lui followed by an ori
• Converts numbers specified in decimal and hexadecimal into their binary equivalents and characters into their ASCII equivalents
• Deals with data layout directives (e.g., .asciiz)
• Expands macros (frequently used sequences of instructions)
MIPS (spim) Memory Allocation
[Figure: memory map of 2^30 words, listed here from the highest address down]
ffff fffc    Mem Map I/O, Kernel Code & Data
7fff fffc    Stack ($sp), growing down toward the dynamic data
             Dynamic data
1000 8000    ($gp)
1000 0000    Static data
0040 0000    Text Segment (PC)
0000 0000    Reserved
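To make the boundaries in the memory map concrete, here is a small sketch that classifies an address using only the values shown in the figure. The enum and function names are invented for this example, and dynamic data and stack are lumped into one region because telling them apart would need the current value of $sp.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    REGION_RESERVED,       /* 0x0000 0000 up to the text segment  */
    REGION_TEXT,           /* 0x0040 0000 up to the static data   */
    REGION_DATA_OR_STACK,  /* 0x1000 0000 through word 0x7fff fffc */
    REGION_KERNEL_OR_IO    /* everything above the top of the stack */
} Region;

Region classify_address(uint32_t address)
{
    if (address < 0x00400000u) return REGION_RESERVED;
    if (address < 0x10000000u) return REGION_TEXT;
    if (address <= 0x7FFFFFFFu) return REGION_DATA_OR_STACK;
    return REGION_KERNEL_OR_IO;
}

/* Index of an aligned word within the 2^30-word memory. */
uint32_t word_index(uint32_t address)
{
    assert(address % 4 == 0);
    return address / 4;
}
```

The last word, at 0xfffffffc, has index 2^30 - 1, which is where the "2^30 words" figure comes from: 2^32 byte addresses divided by 4 bytes per word.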
The Code Translation Hierarchy
[Figure: C program → compiler → assembly code → assembler → object code (plus library routines) → linker → executable; the object code and the executable are machine code]
Linker
• Takes all of the independently assembled code segments and “stitches” (links) them together
  - Faster to recompile and reassemble a patched segment than it is to recompile and reassemble the entire program
1. Decides on memory allocation pattern for the code and data modules of each segment
  - Remember, segments were assembled in isolation, so each has assumed its code’s starting location is 0x0000 0000
2. Relocates absolute addresses to reflect the new starting location of the code segment and its data module
3. Uses the symbol table information to resolve all remaining undefined labels
  - branches, jumps, and data addresses to/in external segments
• Linker produces an executable file
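Steps 1 and 2 come down to simple address arithmetic, sketched below under simplifying assumptions: text segments are placed back to back from a chosen base, and every address that was assembled relative to 0 moves by the start of its segment. The function names and the segment sizes in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 1: place segments back to back, starting at `base`.
 * sizes[i] is the length in bytes of segment i; starts[i] receives
 * the address where that segment begins in the executable. */
void assign_segment_starts(uint32_t base, const uint32_t sizes[],
                           uint32_t starts[], size_t n)
{
    uint32_t next = base;
    for (size_t i = 0; i < n; i++) {
        assert(sizes[i] % 4 == 0);   /* whole instructions only */
        starts[i] = next;
        next += sizes[i];
    }
}

/* Step 2: an address assembled relative to 0 moves by the segment start. */
uint32_t relocate(uint32_t segment_start, uint32_t assembled_address)
{
    return segment_start + assembled_address;
}
```

With a base of 0x00400000 and two segments of 0x60 and 0x40 bytes, the segments start at 0x00400000 and 0x00400060, and a label assembled at 0x18 in the first segment ends up at 0x00400018.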
Linker Code Schematic
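[Figure: three inputs feed the linker. One object file contains “main: jal ??? . . . jal ???” together with its relocation records “call, rt_1f” and “call, printf”; a second object file contains “rt_1f: . . .”; the C library contains “printf: . . .”. The linker produces an executable file containing “main: jal printf . . . jal rt_1f”, “printf: . . .” and “rt_1f: . . .”]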
Linker
• Step 1: Take text segment from each .o file and put them together.
• Step 2: Take data segment from each .o file, put them together, and concatenate this onto end of text segments.
• Step 3: Resolve References
  - Go through Relocation Table and handle each entry
  - That is, fill in all absolute addresses
Linking Two Object Files
[Figure: File 1 (Hdr, Txtseg, Dseg, Reloc, Smtbl, Dbg) + File 2 (Hdr, Txtseg, Dseg, Reloc, Smtbl, Dbg) → Executable (Hdr, Txtseg, Dseg, Reloc)]
Four Types of Addresses we will discuss
• PC-Relative Addressing (beq, bne): never relocate
• Absolute Address (j, jal): always relocate
• External Reference (usually jal): always relocate
• Data Reference (often lui and ori): always relocate
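To show what "relocating" an absolute address means at the bit level, the sketch below fills in the target field of a `j` or `jal` word. It relies on the standard MIPS J-format (6-bit opcode, 26-bit word address); the function name is made up for this example. A PC-relative branch needs no such patch because its offset does not change when the whole segment moves.

```c
#include <assert.h>
#include <stdint.h>

/* Fill the 26-bit target field of a j/jal word with an absolute address.
 * The field holds the word address (byte address / 4); the top 4 address
 * bits are not stored and come from the PC at run time. */
uint32_t patch_jump_target(uint32_t instruction, uint32_t target_address)
{
    assert(target_address % 4 == 0);
    uint32_t field = (target_address >> 2) & 0x03FFFFFFu;
    return (instruction & 0xFC000000u) | field;   /* keep the opcode */
}
```

Patching an empty `jal` (0x0C000000) with printf's relocated address 0x004003b0 (its symbol table entry 0x000003b0 plus the 0x00400000 start) gives 0x0C1000EC, which is the `jal` word 00001100000100000000000011101100 in the binary listing.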
Resolving References (1/2)
• Linker assumes first word of first text segment is at address 0x00000000.
• Linker knows:
  - length of each text and data segment
  - ordering of text and data segments
• Linker calculates:
  - absolute address of each label to be jumped to (internal or external) and each piece of data being referenced
Resolving References (2/2)
• To resolve references:
  - search for reference (data or label) in all “user” symbol tables
  - if not found, search library files (for example, for printf)
  - once absolute address is determined, fill in the machine code appropriately
• Output of linker: executable file containing text and data (plus header)
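The search order described above (user symbol tables first, then libraries) can be sketched as follows. This is only an illustration: `LinkSymbol`, `find_in_table` and `resolve_reference` are hypothetical names, the tables are plain arrays, and the addresses are assumed to be already relocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *name;
    uint32_t address;     /* absolute address after relocation */
} LinkSymbol;

/* Linear search of one symbol table; returns NULL if the name is absent. */
const LinkSymbol *find_in_table(const LinkSymbol table[], size_t n,
                                const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Search the "user" symbols first, then the library symbols.
 * Returns 1 and stores the address on success, 0 if still undefined. */
int resolve_reference(const LinkSymbol user[], size_t n_user,
                      const LinkSymbol library[], size_t n_library,
                      const char *name, uint32_t *address)
{
    assert(name != NULL && address != NULL);
    const LinkSymbol *hit = find_in_table(user, n_user, name);
    if (hit == NULL) {
        hit = find_in_table(library, n_library, name);
    }
    if (hit == NULL) {
        return 0;          /* the linker would report an undefined symbol */
    }
    *address = hit->address;
    return 1;
}
```

Once `resolve_reference` returns an address, the final step is to write it into the instruction named by the relocation entry.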
The Code Translation Hierarchy
[Figure: C program → compiler → assembly code → assembler → object code (plus library routines) → linker → executable → loader → memory; the object code and the executable are machine code]
Loader (1/3)
• Input: Executable Code (e.g., a.out for MIPS)
• Output: (program is run)
• Executable files are stored on disk.
• When one is run, loader’s job is to load it into memory and start it running.
• In reality, loader is the operating system (OS)
  - loading is one of the OS tasks
Loader (2/3)
• So what does a loader do?
• Reads executable file’s header to determine size of text and data segments
• Creates new address space for program large enough to hold text and data segments, along with a stack segment
• Copies instructions and data from executable file into the new address space (this may be anywhere in memory as we will see later)
Loader (3/3)
• Copies arguments passed to the program onto the stack
• Initializes machine registers
  - Most registers cleared, but stack pointer assigned address of 1st free stack location
• Jumps to start-up routine that copies program’s arguments from stack to registers and sets the PC
  - If main routine returns, start-up routine terminates program with the exit system call
• In SPIM: jumps to a start-up routine (at PC addr 0x0040 0000) that copies the parameters into the argument registers and then calls the main routine of the program with a jal main
Things to Remember (1/3)
[Figure: C program (foo.c) → Compiler → Assembly program (foo.s) → Assembler → Object, mach lang module (foo.o) → Linker, which also takes lib.o → Executable, mach lang pgm (a.out) → Loader → Memory]
Things to Remember (3/3)
• Stored Program concept means instructions are just like data, so we can take data from storage and keep transforming it until we load registers and jump to a routine to begin execution
  - Compiler → Assembler → Linker (→ Loader)
• Assembler does 2 passes to resolve addresses, handling internal forward references
• Linker enables separate compilation, libraries that need not be compiled, and resolves remaining addresses
Things to Remember (2/3)
• Compiler converts a single HLL file into a single assembly language file.
• Assembler removes pseudoinstructions, converts what it can to machine language, and creates a checklist for the linker (relocation table). This changes each .s file into a .o file.
• Linker combines several .o files and resolves absolute addresses.
• Loader loads executable into memory and begins execution.
Dynamically Linked Libraries
• Statically linking libraries means that the library becomes part of the executable code
  - It loads the whole library even if only a small part is used (e.g., standard C library is 2.5 MB)
  - What if a new version of the library is released?
• (Lazy) dynamically linked libraries (DLL): library routines are not linked and loaded until a routine is called during execution
  - The first time the library routine is called, a dynamic linker-loader must find the desired routine, remap it, and “link” it to the calling routine (see book for more details)
  - DLLs require extra space for dynamic linking information, but do not require the whole library to be copied or linked
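The lazy scheme can be imitated in ordinary C with a function pointer that starts out unbound and is filled in on the first call. This is a toy model under loud assumptions, not how a real dynamic linker-loader works: the "library" is a fixed array in the same program, and `lib_square`, `lib_negate`, `LazySlot`, `find_routine` and `call_lazy` are all names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*Routine)(int);

/* Stand-ins for routines that would live in a shared library. */
int lib_square(int x) { return x * x; }
int lib_negate(int x) { return -x; }

typedef struct {
    const char *name;
    Routine address;
} LibraryEntry;

static const LibraryEntry library[] = {
    { "square", lib_square },
    { "negate", lib_negate },
};

/* One slot per imported routine; `target` stays NULL until first use. */
typedef struct {
    const char *name;
    Routine target;
    int lookups;          /* how many times the slow path ran */
} LazySlot;

/* Slow path: stands in for the dynamic linker-loader's search. */
Routine find_routine(const char *name)
{
    for (size_t i = 0; i < sizeof library / sizeof library[0]; i++) {
        if (strcmp(library[i].name, name) == 0) {
            return library[i].address;
        }
    }
    return NULL;
}

/* Call through the slot, linking the routine only if it is still unbound. */
int call_lazy(LazySlot *slot, int argument)
{
    if (slot->target == NULL) {
        slot->target = find_routine(slot->name);
        slot->lookups++;
        assert(slot->target != NULL);   /* unresolved symbol */
    }
    return slot->target(argument);
}
```

Calling `call_lazy` twice on a slot for "square" performs the lookup only once; the second call goes straight through the stored pointer. The slot itself is the "extra space for dynamic linking information" mentioned above.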
Dynamically linked libraries
• Space/time issues
  + Storing a program requires less disk space
  + Sending a program requires less time
  + Executing two programs requires less memory (if they share a library)
  – At runtime, there’s time overhead to do link
• Upgrades
  + Replacing one file (libXYZ.so) upgrades every program that uses library “XYZ”
  – Having the executable isn’t enough anymore
This does add quite a bit of complexity to the compiler, linker, and operating system. However, it provides many benefits: