The Assembly Language Level Part B – The Assembly Process


The Assembly Language Level

Part B – The Assembly Process

Specifying numeric values

Java

    int i = 10;
    int j = 0x10;
    int k = 010;

    System.out.println( i );
    System.out.println( j );
    System.out.println( k );

What appears?

MASM

    i   dword 10d
    j   dword 10h
    k   dword 10o
    el  = 20D
    m   = 20H
    n   = 20O

C/C++

    int i = 10;
    int j = 0x10;
    int k = 010;
    #define el 20
    #define m  0x20
    #define n  020


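Output: 10, 16, and 8 (0x10 is hexadecimal for 16; the leading 0 makes 010 an octal literal, which is 8).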
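The suffix notation used in the MASM examples (10d, 10h, 10o) can be illustrated with a small converter. This is only a sketch: the class RadixLiteral and its parse method are hypothetical names invented for this example, and it assumes a default radix of 10 and handles only the d, h, o, and b suffixes.

```java
import java.util.Locale;

// Hypothetical helper (not part of MASM or the JDK): converts a literal with a
// trailing radix letter, as in the MASM examples, into an int.
final class RadixLiteral {

    static int parse(String text) {
        String s = text.trim().toLowerCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new NumberFormatException("empty literal");
        }
        char suffix = s.charAt(s.length() - 1);
        String digits = s.substring(0, s.length() - 1);
        switch (suffix) {
            case 'h': return Integer.parseInt(digits, 16); // hexadecimal
            case 'o': return Integer.parseInt(digits, 8);  // octal
            case 'b': return Integer.parseInt(digits, 2);  // binary
            case 'd': return Integer.parseInt(digits, 10); // explicit decimal
            default:  return Integer.parseInt(s, 10);      // no suffix: assume decimal
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("10d")); // 10
        System.out.println(parse("10h")); // 16
        System.out.println(parse("10o")); // 8
    }
}
```

The same three values appear as in the Java and C/C++ versions; only the way the radix is spelled differs.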

FORWARD REFERENCES (when a symbol is used before it is defined)

Two-pass assembler (translator)

    class test {
        public static void main ( String args[] ) {
            System.out.println( "k=" + k );
            f();
        }

        static int k = 72;

        static void f ( ) {
            System.out.println( "in f()" );
        }
    }

valid forward references

Two-pass assembler (translator)

    class test {
        public static void main ( String args[] ) {
            System.out.println( "k=" + k );
            int k = 72;
        }
    }

invalid forward reference (the local variable k is used before it is declared)

Two-pass assembler (translator)

• Forward reference
    – a symbol is used before it is defined
• Solutions:
    1. (One-pass translators make only one pass and don’t allow any forward references at all.)
    2. Make two passes (read the source file twice).
        • The first pass collects symbol definitions and label locations.
        • The second pass uses the table built in the first pass.
    3. Make one pass over the source and produce an intermediate form.
        • A second pass is made over this intermediate data.

PASS ONE

Pass one

• Purpose: build the symbol table
    – a table containing:
        • the name of the symbol and its value
        • or the name of the label and its location
    – ILC = Instruction Location Counter
    – a variable set to 0 and incremented by the instruction (or data) length for each line of code
    – Note: both code and data may be labeled.
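The bookkeeping described above can be sketched in a few lines of Java. Everything here is an assumption made for illustration: the class PassOneSketch, the toy syntax (labels end with a colon, "name equ value" defines a constant), and the instruction lengths in LENGTH, which do not come from any real opcode table.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical pass one for a toy assembly syntax.
final class PassOneSketch {

    // Assumed lengths in bytes; a real assembler looks these up in its
    // opcode and pseudoinstruction tables.
    static final Map<String, Integer> LENGTH = Map.of(
            "mov", 5, "jmp", 5, "jz", 2, "nop", 1, "db", 1, "dw", 2, "dd", 4);

    static Map<String, Integer> buildSymbolTable(String[] source) {
        Map<String, Integer> symbols = new LinkedHashMap<>();
        int ilc = 0; // instruction location counter
        for (String raw : source) {
            String line = raw.split(";", 2)[0].trim(); // drop the comment
            if (line.isEmpty()) continue;
            String[] tok = line.split("\\s+");
            if (tok.length == 3 && tok[1].equalsIgnoreCase("equ")) {
                define(symbols, tok[0], Integer.parseInt(tok[2])); // symbol and value
                continue;
            }
            int next = 0;
            if (tok[0].endsWith(":")) { // label and location
                define(symbols, tok[0].substring(0, tok[0].length() - 1), ilc);
                next = 1;
            }
            if (next < tok.length) {
                Integer length = LENGTH.get(tok[next].toLowerCase(Locale.ROOT));
                if (length == null) {
                    throw new IllegalStateException("unrecognized opcode: " + tok[next]);
                }
                ilc += length; // operands are not examined in pass one
            }
        }
        return symbols;
    }

    private static void define(Map<String, Integer> symbols, String name, int value) {
        if (symbols.putIfAbsent(name, value) != null) {
            throw new IllegalStateException("multiply defined symbol: " + name);
        }
    }

    public static void main(String[] args) {
        String[] src = {
            "bufsize equ 8192",
            "start: mov eax, 1   ; ILC 0",
            "       jz  done     ; ILC 5, forward reference to done",
            "       nop          ; ILC 7",
            "done:  nop          ; ILC 8"
        };
        System.out.println(buildSymbolTable(src)); // {bufsize=8192, start=0, done=8}
    }
}
```

Note that the forward reference to done causes no trouble here, because pass one only records where labels are defined and never looks them up.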

Tanenbaum, Structured Computer Organization, Fifth Edition, (c) 2006

Pearson Education, Inc. All rights reserved. 0-13-148521-0

Pass One: Two-Pass Assemblers

The instruction location counter (ILC) keeps track of the address where the instructions will be loaded in memory.

In this example, the statements prior to MARIA occupy 100 bytes.

Pass one

• Symbol types:
    1. Symbol and corresponding value
        • Assigned by a pseudoinstruction
        • Ex. bufsize equ 8192
    2. Label and corresponding location

            ;CF flag
            test  ebx, CF           ;carry set?
            jz    cf0               ;jump if 0
            print SADD(" CF:1")     ;flag is on
            jmp   nxt               ;skip over else part
        cf0:
            print SADD(" CF:0")     ;flag is off
        nxt:

Pass one

• Employs 3 (or sometimes 4) tables:
    1. Symbol table
    2. Pseudoinstruction table
    3. Opcode table
    4. Literal table (optional)

Pass one

• Employs 3 tables:
    1. Symbol table
        • Symbol name
        • Value (ILC for labels; definition/value for equ)
        • Length of data field (especially for strings)
        • Relocation bits (relocatable?)
        • Scope of symbol
    2. Pseudoinstruction table
    3. Opcode table
    4. Literal table (optional)


Pass One: Two-Pass Assemblers

A symbol table for the program of Fig. 7-7.

Pass one

• Employs 3 tables:
    1. Symbol table
    2. Pseudoinstruction table (pseudoinstruction and data length in bytes)

            db    1
            dw    2
            dd    4

    3. Opcode table
    4. Literal table (optional)


Pass one: Two-Pass Assemblers

A few excerpts from the opcode table for a Pentium 4 assembler.

Pass one

• Employs 3 tables:
    1. Symbol table
    2. Pseudoinstruction table
    3. Opcode table
    4. Literal table (optional)
        • for pseudoimmediate instructions
            – used when the machine has no support for immediate instructions
            – and only loads from memory are allowed


Pass One (part 1)

Pass one of a simple assembler.


Pass One (part 2)

Pass one of a simple assembler.


(Of course, a line might contain more than one literal!)


Pass One (part 3)

Pass one of a simple assembler.

PASS TWO

Pass two

• Purpose:
    – to generate object code
    – to optionally generate an assembly listing
• Reads in each line, one line at a time (from the original or intermediate code).
• Processes each line.
    – writes out binary object code
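A minimal sketch of pass two follows, assuming the symbol table has already been built. The class PassTwoSketch, the one-byte opcodes in OPCODE, and the fixed two-byte instruction format (opcode byte plus operand byte) describe an invented toy machine, not the Pentium 4 or any real assembler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical pass two for a toy machine: every instruction is one opcode
// byte followed by one operand byte.
final class PassTwoSketch {

    // Assumed opcode values, invented for this example.
    static final Map<String, Integer> OPCODE =
            Map.of("load", 0x01, "add", 0x02, "jmp", 0x03, "jz", 0x04);

    static List<Integer> assemble(String[] source, Map<String, Integer> symbols) {
        List<Integer> object = new ArrayList<>();
        for (String raw : source) {
            String line = raw.split(";", 2)[0].trim();   // drop the comment
            int colon = line.indexOf(':');
            if (colon >= 0) {
                line = line.substring(colon + 1).trim(); // label was handled in pass one
            }
            if (line.isEmpty()) continue;
            String[] tok = line.split("\\s+");
            Integer opcode = OPCODE.get(tok[0].toLowerCase(Locale.ROOT));
            if (opcode == null) {
                throw new IllegalStateException("unrecognized opcode: " + tok[0]);
            }
            if (tok.length != 2) {
                throw new IllegalStateException("too few/many operands: " + line);
            }
            object.add(opcode);
            object.add(operandValue(tok[1], symbols));
        }
        return object;
    }

    // An operand is either a decimal number or a symbol from pass one.
    static int operandValue(String operand, Map<String, Integer> symbols) {
        if (Character.isDigit(operand.charAt(0))) {
            return Integer.parseInt(operand);
        }
        Integer value = symbols.get(operand);
        if (value == null) {
            throw new IllegalStateException("symbol used but not defined: " + operand);
        }
        return value;
    }

    public static void main(String[] args) {
        String[] src = {
            "start: load 7",
            "       jz   done   ; forward reference, resolved from the table",
            "       add  1",
            "done:  jmp  start"
        };
        // Table as pass one would build it: each instruction is 2 bytes long.
        Map<String, Integer> symbols = Map.of("start", 0, "done", 6);
        System.out.println(assemble(src, symbols)); // [1, 7, 4, 6, 2, 1, 3, 0]
    }
}
```

Because the table is complete before pass two starts, the reference to done on the second line resolves just like the backward reference to start on the last line.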

MASM listing (.lst) file

 00000000                       .data                       ;insert variables below
 = "<hit return>"               prompt equ "<hit return>"   ;prompt string
 00000000                       .code                       ;insert executable instructions below
 00000000                       main PROC                   ;program execution begins here
 00000000  B8 00000001          mov eax, 1                  ;set regs values
 00000005  BB 00000002          mov ebx, 2
 0000000A  B9 00000003          mov ecx, 3
 0000000F  BA 00000004          mov edx, 4
 00000014  BE 00000005          mov esi, 5
 00000019  BF 00000006          mov edi, 6
 0000001E  E8 00000036          call dump                   ;show contents of regs
 00000000                    1  .data
 00000000  3C 68 69 74 20    1  ??0019 db prompt, 0
           72 65 74 75 72
           6E 3E 00
 00000000                    1  .data?
 00000000                    1  ??001A db 132 dup (?)
 00000023                    1  .code
 0000003C  C6 80 00000000 R  1  mov BYTE PTR [??001A+eax], 0
           00
 0000004D  B8 00000000 R        mov eax, input(prompt)      ;prompt the user
                                exit                        ;end of program
 00000059                       main ENDP


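ASCII & MASM listing file

 00000000                       .data                       ;insert variables below
 = "<hit return>"               prompt equ "<hit return>"   ;prompt string
 00000000                    1  .data
 00000000  3C 68 69 74 20    1  ??0019 db prompt, 0
           72 65 74 75 72
           6E 3E 00

The bytes 3C 68 69 74 20 72 65 74 75 72 6E 3E are the ASCII codes for the characters of "<hit return>"; the final 00 is the terminating zero byte.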

Pass two

Possible errors:
    1. Symbol used but not defined
    2. Multiply defined symbol
    3. Unrecognized opcode
    4. Too few/many operands
    5. Malformed number (binary/octal/decimal/hex)
    6. Illegal register use (e.g., branch to a register)
    7. END missing

Pass Two (part 1)

Pass two of a simple assembler.


Read one entry from the intermediate file.

(Questionable limitations (16 bytes).)

Pass Two (part 2)

Pass two of a simple assembler.

SYMBOL TABLE ORGANIZATION

Symbol table

• Stores <name,value> pairs.
• How do we find the value associated with a particular symbol?
• If we store the symbol table in an unordered array, then linear search requires:
    – best case: 1 comparison
    – worst case: n comparisons
    – average case: about n/2 comparisons
    – O(n)

Symbol table

• If we store the symbol table in a sorted array, then binary search requires:
    – best case: 1 comparison
    – worst case: about log2(n) comparisons
    – O(log2 n)
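As an illustration of the sorted-array case, here is a hand-written binary search over symbol names. The class SortedSymbolLookup and the sample names are made up for this sketch; the array must already be sorted for the search to work.

```java
// Hypothetical example: binary search over a sorted array of symbol names.
final class SortedSymbolLookup {

    // Returns the index of key in sorted, or -1 if it is absent.
    // Each iteration halves the remaining range, giving O(log2 n) comparisons.
    static int find(String[] sorted, String key) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;          // midpoint without int overflow
            int cmp = key.compareTo(sorted[mid]);
            if (cmp == 0) return mid;
            if (cmp < 0) hi = mid - 1;          // key sorts before sorted[mid]
            else lo = mid + 1;                  // key sorts after sorted[mid]
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] names = { "cf0", "done", "main", "nxt", "start" }; // sorted
        int[] values  = { 12, 40, 0, 20, 4 };    // made-up locations, parallel to names
        int i = find(names, "nxt");
        System.out.println(i >= 0 ? "nxt = " + values[i] : "nxt not found"); // nxt = 20
    }
}
```

The price of the faster lookup is that the array has to be kept sorted, so inserting a new symbol during pass one costs O(n) in the worst case.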

Symbol table

• If we store the symbol table in a hash table, then search requires:
    – best case = worst case = average case = O(1), provided we have a perfect hash function (one that never maps two symbols to the same position)
        • H: S --> I (H maps strings to integers)

Symbol table

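• Hash function H: S --> I
    – H maps strings to integers.
    – We store our symbols in a table of size k.
    – Possible hash functions (s_i is the code of the i-th character of s):

        $H(s) = \left( \sum_i s_i \right) \,\%\, k$

        $H(s) = \left( \sum_i i \cdot s_i \right) \,\%\, k$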
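Two simple candidates for H are the sum of the character codes modulo k and an index-weighted sum modulo k. The sketch below shows one reading of them in Java. The class SymbolHash and its method names are hypothetical, and counting character positions from 1 (so that the first character contributes to the weighted sum) is an assumption of this example.

```java
// Hypothetical hash functions for symbol names; k is the table size.
final class SymbolHash {

    // Sum of character codes, modulo k. Anagrams always collide.
    static int sumHash(String s, int k) {
        long sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return (int) (sum % k);
    }

    // Position-weighted sum, modulo k. Positions are counted from 1 here
    // (an assumption), so the order of the characters matters.
    static int weightedHash(String s, int k) {
        long sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += (long) (i + 1) * s.charAt(i);
        }
        return (int) (sum % k);
    }

    public static void main(String[] args) {
        int k = 8;
        System.out.println(sumHash("abc", k) + " " + sumHash("cba", k));           // 6 6
        System.out.println(weightedHash("abc", k) + " " + weightedHash("cba", k)); // 6 2
    }
}
```

With the plain sum, "abc" and "cba" land in the same position; weighting by position separates them, although collisions are still possible for other pairs of symbols.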


The Symbol Table (1)

Hash coding. (a) Symbols, values, and the hash codes derived from the symbols.

We need a method to cope with the situation when the hash scheme is less than perfect (i.e., when a collision occurs).


Chaining: One solution is to make a linked list of symbols that hash to the same value.


The Symbol Table (2)

Hash coding. (b) Eight-entry hash table with linked lists of symbols and values.

Analysis of hashing with chaining

• If all symbols hash to the same position (i.e., to only one position), what happens?

• Load factor
    – Given a hash table T with
        • m slots that stores
        • n elements,
        • we define a load factor f for T as f = n / m.
            » f is the average number of elements in a chain.
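A small chained hash table makes the load factor concrete. This is a sketch under stated assumptions: ChainedSymbolTable is a hypothetical class written for this example, it uses String.hashCode() as the hash function, and it never resizes.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical symbol table using hashing with chaining: m slots, each
// holding a linked list of the entries that hash to that slot.
final class ChainedSymbolTable {

    private static final class Entry {
        final String name;
        int value;
        Entry(String name, int value) { this.name = name; this.value = value; }
    }

    private final List<LinkedList<Entry>> slots = new ArrayList<>();
    private int size; // n, the number of stored elements

    ChainedSymbolTable(int m) {
        for (int i = 0; i < m; i++) {
            slots.add(new LinkedList<>());
        }
    }

    private LinkedList<Entry> chainFor(String name) {
        // floorMod keeps the index non-negative even for negative hash codes
        return slots.get(Math.floorMod(name.hashCode(), slots.size()));
    }

    void put(String name, int value) {
        LinkedList<Entry> chain = chainFor(name);
        for (Entry e : chain) {
            if (e.name.equals(name)) { e.value = value; return; } // update in place
        }
        chain.add(new Entry(name, value));
        size++;
    }

    Integer get(String name) {
        for (Entry e : chainFor(name)) {      // linear search within one chain
            if (e.name.equals(name)) return e.value;
        }
        return null;
    }

    double loadFactor() {
        return (double) size / slots.size();  // f = n / m
    }

    public static void main(String[] args) {
        ChainedSymbolTable t = new ChainedSymbolTable(8);
        t.put("fred", 1);
        t.put("ethel", 29);
        t.put("barney", 3);
        System.out.println(t.get("ethel") + " " + t.loadFactor()); // 29 0.375
    }
}
```

Constructing the table with m = 1 forces every symbol into the same chain: lookups still return the right values, but each one is then a linear search, so the cost falls back to O(n).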

Java and hash tables

• http://download.oracle.com/javase/6/docs/api/java/util/Hashtable.html

This example creates a hashtable of names & numbers. It uses names as keys:

    Hashtable<String, Integer> tbl = new Hashtable< String, Integer >();
    tbl.put( "fred", 1 );
    tbl.put( "ethel", 29 );
    tbl.put( "barney", 3 );

To retrieve a number, use the following code:

    Integer v = tbl.get( "ethel" );
    if (v != null) {
        System.out.println( "ethel = " + v );
    }

Note: What’s going on with 1, 29, and 3, and Integer?

C++ and hash tables

See http://en.wikipedia.org/wiki/Hash_map_%28C%2B%2B%29

    #include <cstring>      // for strcmp
    #include <hash_map>

    struct eqstr {
        bool operator() ( const char* s1, const char* s2 ) const {
            return strcmp(s1,s2) == 0;  //true when eq
        }
    };
    …
    hash_map< const char*, int, hash<const char*>, eqstr > tbl;
    …
    tbl["fred"] = 1;
    tbl["ethel"] = 29;
    tbl["barney"] = 3;
    …
    int v = tbl["ethel"];

Or one may use

    iterator find ( const key_type& k )

or

    size_type count ( const key_type& k ) const

to check whether a key is present before indexing.

Comparison (Java vs. C++)

Java (objects only!):

    import java.util.Hashtable;
    …
    Hashtable<String, Integer> tbl = new Hashtable< String, Integer >();
    …
    tbl.put( "fred", 1 );
    tbl.put( "ethel", 29 );
    tbl.put( "barney", 3 );
    …
    Integer v = tbl.get( "ethel" );
    if (v != null) {
        System.out.println( "ethel = " + v );
    }

C++ (nicer syntax; objects or primitive types):

    #include <cstring>      // for strcmp
    #include <hash_map>

    struct eqstr {
        bool operator() ( const char* s1, const char* s2 ) const {
            return strcmp(s1,s2)==0;
        }
    };
    …
    hash_map< const char*, int, hash<const char*>, eqstr > tbl;
    …
    tbl["fred"] = 1;
    tbl["ethel"] = 29;
    tbl["barney"] = 3;
    …
    int v = tbl["ethel"];
    //can/should use find or count first to check for
    // existence

Why?