32
Computer Architecture and Design – ECEN 350 Part 4 [Some slides adapted from M. Irwin, D. Paterson and others]

Computer Architecture and Design – ECEN 350

  • Upload
    toshi

  • View
    43

  • Download
    1

Embed Size (px)

DESCRIPTION

Computer Architecture and Design – ECEN 350. Part 4. [Some slides adapted from M. Irwin, D. Paterson and others]. compiler. assembly code. The Code Translation Hierarchy. C program. Compiler. Transforms the C program into an assembly language program Advantages of high-level languages - PowerPoint PPT Presentation

Citation preview

Page 1: Computer Architecture and Design – ECEN 350

Computer Architecture and Design – ECEN 350

Part 4

[Some slides adapted from M. Irwin, D. Paterson and others]

Page 2: Computer Architecture and Design – ECEN 350

The Code Translation HierarchyC program

compiler

assembly code

Page 3: Computer Architecture and Design – ECEN 350

CompilerTransforms the C program into an assembly

language programAdvantages of high-level languages

much fewer lines of code easier to understand and debug

Today’s optimizing compilers can produce assembly code nearly as good as an assembly language programming expert and often better for large programs smaller code size, faster execution

Page 4: Computer Architecture and Design – ECEN 350

The Code Translation HierarchyC program

compiler

assembly code

assembler

object code

Page 5: Computer Architecture and Design – ECEN 350

AssemblerTransforms symbolic assembler code into object

(machine) codeAdvantages of assembler

much easier than remembering instr’s binary codes can use labels for addresses – and let the assembler

do the arithmetic can use pseudo-instructions

- e.g., “move $t0, $t1” exists only in assembler (would be implemented using “add $t0,$t1,$zero”)

When considering performance, you should count instructions executed, not code size

Page 6: Computer Architecture and Design – ECEN 350

The Two Main Tasks of the AssemblerFinds the memory locations with labels so the

relationship between the symbolic names and their addresses is known Symbol table – holds labels and their

corresponding addresses- A label is local if the object is used only within the file

where its defined. Labels are local by default.- A label is external (global) if it refers to code or data in

another file or if it is referenced from another file. Global labels must be explicitly declared global (e.g., .globl main)

Translates each assembly language statement by combining the numeric equivalent of the opcodes, register specifiers, and labels

Page 7: Computer Architecture and Design – ECEN 350

Example: C Asm Obj Exe Run

#include <stdio.h>int main (int argc, char *argv[]) { int i; int sum = 0;

for (i = 0; i <= 100; i = i + 1) sum = sum + i * i;

printf ("The sum from 0 .. 100 is %d\n", sum);}

Page 8: Computer Architecture and Design – ECEN 350

Example: C Asm Obj Exe Run

.text.align 2.globl main

main:subu $sp,$sp,40sw $ra, 20($sp)sd $a0, 32($sp)sw $0, 24($sp)sw $0, 28($sp)

loop:lw $t6, 28($sp)mul $t7, $t6,$t6lw $t8, 24($sp)addu $t9,$t8,$t7sw $t9, 24($sp)

addu $t0, $t6, 1sw $t0, 28($sp)ble $t0,100, loopla $a0, strlw $a1, 24($sp)jal printfmove $v0, $0lw $ra, 20($sp)addiu $sp,$sp,40j $ra.data.align 0

str: .asciiz "The sum from 0 .. 100 is %d\n"

Page 9: Computer Architecture and Design – ECEN 350

Symbol Table Entries

Label Address main: loop: str: printf: ?

Page 10: Computer Architecture and Design – ECEN 350

Example: C Asm Obj Exe Run

00 addiu $29,$29,-4004 sw $31,20($29)08 sw $4, 32($29)0c sw $5, 36($29)10 sw $0, 24($29)14 sw $0, 28($29)18 lw $14, 28($29)1c multu $14, $1420 mflo $1524 lw $24, 24($29)28 addu $25,$24,$152c sw $25, 24($29)

30 addiu $8,$14, 134 sw $8,28($29)38 slti $1,$8, 101 3c bne $1,$0, loop40 lui $4, l.str44 ori $4,$4,r.str 48 lw $5,24($29)4c jal printf50 add $2, $0, $054 lw $31,20($29) 58 addiu $29,$29,405c jr $31

• Remove pseudoinstructions, assign addresses

Page 11: Computer Architecture and Design – ECEN 350

Symbol Table Entries

Symbol Table Label Addressmain: 0x00000000loop: 0x00000018str: 0x10000430printf: 0x000003b0

Relocation InformationAddress Instr. Type Dependency 0x00000040 lui l.str 0x00000044 ori r.str 0x0000004c jal printf

Page 12: Computer Architecture and Design – ECEN 350

Example: C Asm Obj Exe Run

00 addiu $29,$29,-4004 sw $31,20($29)08 sw $4, 32($29)0c sw $5, 36($29)10 sw $0, 24($29)14 sw $0, 28($29)18 lw $14, 28($29)1c multu $14, $1420 mflo $1524 lw $24, 24($29)28 addu $25,$24,$152c sw $25, 24($29)

30 addiu $8,$14, 134 sw $8,28($29)38 slti $1,$8, 101 3c bne $1,$0, -10 40 lui $4, 409644 ori $4,$4,1072 48 lw $5,24($29)

4c jal 812 50 add $2, $0, $054 lw $31,20($29) 58 addiu $29,$29,405c jr $31

• Edit Addresses: start at 0x0040000

Page 13: Computer Architecture and Design – ECEN 350

Example: C Asm Obj Exe Run0x004000 001001111011110111111111111000000x004004 101011111011111100000000000101000x004008 101011111010010000000000001000000x00400c 101011111010010100000000001001000x004010 101011111010000000000000000110000x004014 101011111010000000000000000111000x004018 100011111010111000000000000111000x00401c 100011111011100000000000000110000x004020 000000011100111000000000000110010x004024 001001011100100000000000000000010x004028 001010010000000100000000011001010x00402c 101011111010100000000000000111000x004030 000000000000000001111000000100100x004034 000000110000111111001000001000010x004038 000101000010000011111111111101110x00403c 101011111011100100000000000110000x004040 001111000000010000010000000000000x004044 100011111010010100000000000110000x004048 000011000001000000000000111011000x00404c 001001001000010000000100001100000x004050 100011111011111100000000000101000x004054 001001111011110100000000001000000x004058 000000111110000000000000000010000x00405c 00000000000000000001000000100001

Page 14: Computer Architecture and Design – ECEN 350

Other Tasks of the AssemblerConverts pseudo-instr’s to legal assembly code

register $at is reserved for the assembler to do thisConverts branches to far away locations into a

branch followed by a jumpConverts instructions with large immediates into

a lui followed by an oriConverts numbers specified in decimal and

hexidecimal into their binary equivalents and characters into their ASCII equivalents

Deals with data layout directives (e.g., .asciiz)Expands macros (frequently used sequences of

instructions)

Page 15: Computer Architecture and Design – ECEN 350

MIPS (spim) Memory AllocationMemory

230

words

0000 0000

f f f f f f f c

TextSegment

Reserved

Static data

Mem Map I/O

0040 0000

1000 00001000 8000

7f f f f f fcStack

Dynamic data

$sp

$gp

PC

Kernel Code & Data

Page 16: Computer Architecture and Design – ECEN 350

The Code Translation HierarchyC program

compiler

assembly code

assembler

object code library routines

executable

linker

machine code

Page 17: Computer Architecture and Design – ECEN 350

Linker Takes all of the independently assembled code

segments and “stitches” (links) them together Faster to recompile and reassemble a patched segment, than it

is to recompile and reassemble the entire program

1. Decides on memory allocation pattern for the code and data modules of each segment Remember, segments were assembled in isolation so each has

assumed its code’s starting location is 0x0000 0000

2. Relocates absolute addresses to reflect the new starting location of the code segment and its data module

3. Uses the symbol tables information to resolve all remaining undefined labels branches, jumps, and data addresses to/in external segments

Linker produces an executable file

Page 18: Computer Architecture and Design – ECEN 350

Linker Code Schematic

rt_1f: . . .

printf: . . .

main: jal ??? . . . jal ???

call, rt_1fcall, printf

Linker

Object file

Object file

C library

Relocation records

main: jal printf . . . jal rt_1fprintf: . . .rt_1f: . . .

Executable file

Page 19: Computer Architecture and Design – ECEN 350

Linker

Step 1: Take text segment from each .o file and put them together.

Step 2: Take data segment from each .o file, put them together, and concatenate this onto end of text segments.

Step 3: Resolve References Go through Relocation Table and handle each entry That is, fill in all absolute addresses

Page 20: Computer Architecture and Design – ECEN 350

Linking Two Object FilesH

dr

T

xtse

g

D

seg

R

eloc

S

mtb

l D

bg

File 1

Hdr

T

xtse

g

Dse

g

Rel

oc S

mtb

l D

bg

File 2+

Executable

Hdr

Tx

tseg

D

seg

Rel

oc

Page 21: Computer Architecture and Design – ECEN 350

Four Types of Addresses we will discuss

PC-Relative Addressing (beq, bne): never relocate

Absolute Address (j, jal): always relocateExternal Reference (usually jal): always

relocateData Reference (often lui and ori): always

relocate

Page 22: Computer Architecture and Design – ECEN 350

Resolving References (1/2)

Linker assumes first word of first text segment is at address 0x00000000.

Linker knows: length of each text and data segment ordering of text and data segments

Linker calculates: absolute address of each label to be jumped to

(internal or external) and each piece of data being referenced

Page 23: Computer Architecture and Design – ECEN 350

Resolving References (2/2)

To resolve references: search for reference (data or label) in all “user”

symbol tables if not found, search library files

(for example, for printf) once absolute address is determined, fill in the

machine code appropriatelyOutput of linker: executable file containing text

and data (plus header)

Page 24: Computer Architecture and Design – ECEN 350

The Code Translation HierarchyC program

compiler

assembly code

assembler

object code library routines

executable

linker

loader

memory

machine code

Page 25: Computer Architecture and Design – ECEN 350

Loader (1/3)

Input: Executable Code(e.g., a.out for MIPS)

Output: (program is run)Executable files are stored on disk.When one is run, loader’s job is to load it into

memory and start it running. In reality, loader is the operating system (OS)

loading is one of the OS tasks

Page 26: Computer Architecture and Design – ECEN 350

Loader (2/3)So what does a loader do?Reads executable file’s header to determine

size of text and data segmentsCreates new address space for program large

enough to hold text and data segments, along with a stack segment

Copies instructions and data from executable file into the new address space (this may be anywhere in memory as we will see later)

Page 27: Computer Architecture and Design – ECEN 350

Loader (3/3)Copies arguments passed to the program onto

the stack Initializes machine registers

Most registers cleared, but stack pointer assigned address of 1st free stack location

Jumps to start-up routine that copies program’s arguments from stack to registers and sets the PC If main routine returns, start-up routine terminates

program with the exit system call In SPIM:

Jumps to a start-up routine (at PC addr 0x0040 0000) that copies the parameters into the argument registers and then calls the main routine of the program with a jal main

Page 28: Computer Architecture and Design – ECEN 350

Things to Remember (1/3)C program: foo.c

Assembly program: foo.s

Executable(mach lang pgm): a.out

Compiler

Assembler

Linker

Loader

Memory

Object(mach lang module): foo.o

lib.o

Page 29: Computer Architecture and Design – ECEN 350

Things to Remember 3/3

Stored Program concept mean instructions just like data, so can take data from storage, and keep transforming it until load registers and jump to routine to begin executionCompiler Assembler Linker ( Loader)

Assembler does 2 passes to resolve addresses, handling internal forward references

Linker enables separate compilation, libraries that need not be compiled, and resolves remaining addresses

Page 30: Computer Architecture and Design – ECEN 350

Things to Remember (2/3)

Compiler converts a single HLL file into a single assembly language file.

Assembler removes pseudoinstructions, converts what it can to machine language, and creates a checklist for the linker (relocation table). This changes each .s file into a .o file.

Linker combines several .o files and resolves absolute addresses.

Loader loads executable into memory and begins execution.

Page 31: Computer Architecture and Design – ECEN 350

Dynamically Linked LibrariesStatically linking libraries mean that the library

becomes part of the executable code It loads the whole library even if only a small part is

used (e.g., standard C library is 2.5 MB) What if a new version of the library is released ?

(Lazy) dynamically linked libraries (DLL) – library routines are not linked and loaded until a routine is called during execution The first time the library routine called, a dynamic

linker-loader must- find the desired routine, remap it, and “link” it to the calling

routine (see book for more details) DLLs require extra space for dynamic linking

information, but do not require the whole library to be copied or linked

Page 32: Computer Architecture and Design – ECEN 350

Dynamically linked libraries

Space/time issues + Storing a program requires less disk space + Sending a program requires less time + Executing two programs requires less memory (if they

share a library) – At runtime, there’s time overhead to do link

Upgrades + Replacing one file (libXYZ.so) upgrades every program

that uses library “XYZ” – Having the executable isn’t enough anymore

This does add quite a bit of complexity to the compiler, linker, and operating system. However, it provides many benefits: