Download pdf - Computer Organization and Assembly Languagehome.adelphi.edu/~siegfried/cs174/174l3.pdfComputer Organization and Assembly Language Lecture 3 – Assembly Language Fundamentals Basic

Computer Organization andAssembly Language

Lecture 3 – Assembly LanguageFundamentals

Basic Elements of Assembly Language

An assembly language program is composed of :• Constants• Expressions• Literals• Reserved Words• Mnemonics• Identifiers• Directives• Instructions• Comments

Integer Constants

• Integer constants can be written in decimal,hexadecimal, octal or binary, by adding a radix (ornumber base) suffix to the end .

• Radix Suffices:– d decimal (the default)– h hexadecimal– q or o octal– b binary

Examples of Integer Constants

• 26 Decimal• 1Ah Hexadecimal• 1101b Binary• 36q Octal• 2Bh Hexadecimal• 42Q Octal• 36D Decimal• 47d Decimal

Integer Expressions

• An integer expressions is a mathematicalexpressions involving integer values and integeroperators.

• The expressions must be one that can be stored in32 bits (or less).

• The precedence:– ( ) Expressions in Parentheses– +, - Unary Plus and minus– *, /, Mod Multiply, Divide, Modulus– +, - Add, Subtract

Examples of Integer Expressions

(4 + 2) * 6

12 – 1 MOD 5

-5 + 214 + 5 * 2

20-3 + 4 * 6 – 1-35- (3 + 4) * (6 – 1)

316 / 5ValueExpression

Real Number Constants

• There are two types of real number constants:– Decimal reals, which contain a sign followed

by a number with decimal fraction and anexponent:[sign] integer.[integer][exponent]Examples:2. +3.0 -44.2E+05 26.E5

– Encoded reals, which are represented exactlyas they are stored:3F80000r

Characters Constants

• A character constant is a single characterenclosed in single or double quotationmarks.

• The assembler converts it to the equivalentvalue in the binary code ASCII:‘A’“d”

String Constants

• A string constant is a string of charactersenclosed in single or double quotationmarks:‘ABC’“x”“Goodnight, Gracie”‘4096’“This isn’t a test”‘Say “Goodnight, ” Gracie’

Reserved Words

• Reserved words have a special meaning to theassembler and cannot be used for anything otherthan their specified purpose.

• They include:– Instruction mnemonics– Directives– Operators in constant expressions– Predefined symbols such as @data which return

constant values at assembly time.

Identifiers

• Identifiers are names that the programmerchooses to represent variables, constants,procedures or labels.

• Identifiers:– can have 1 to 247 characters– are not case-sensitive– begin with a letter , underscore, @ or $ and can

also contain digits after the first character.– cannot be reserved words

Examples of Identifiers

var1 open_file_main _12345@@myfile $firstCount MAXxVal

Directives

• Directives are commands for the assembler,telling it how to assemble the program.

• Directives have a syntax similar to assemblylanguage but do not correspond to Intel processorinstructions.

• Directives are also case-insensitive:• Examples

.data

.codename PROC

Instructions

• An instruction in Assembly language consists of aname (or label), an instruction mnemonic,operands and a comment

• The general form is:[name] [mnemonic] [operands] [; comment]

• Statements are free-form; i.e, they can be writtenin any column with any number of spaces betweenin each operand as long as they are on one line anddo not pass column 128.

Labels

• Labels are identifiers that serve as place markerswithin the program for either code or data.

• These are replaces in the machine-languageversion of the program with numeric addresses.

• We use them because they are more readable:mov ax, [9020]vs.mov ax, MyVariable

Code Labels

• Code labels mark a particular point withinthe program’s code.

• Code labels appear at the beginning and areimmediately followed by a colon:

target:mov ax, bx… …jmp target

Data Labels

• Labels that appear in the operand field of aninstruction:mov first, ax

• Data labels must first be declared in the datasection of the program:first BYTE 10

Instruction Mnemonics

• Instruction mnemonics are abbreviationsthat identify the operation carried out by theinstruction:mov - move a value to another locationadd - add two valuessub - subtract a value from anotherjmp - jump to a new location in the programmul - multiply two valuescall - call a procedure

Operands

• Operands in an assembly languageinstruction can be:– constants 96

– constant expressions 2 + 4

– registers eax

– memory locations count

Operands and Instructions

• All instructions have a predetermined number ofoperands.

• Some instructions use no operands:stc ; set the Carry Flag

• Some instructions use one operand:inc ax ; add 1 to AX

• Some instructions use two operands:mov count, bx ; add BX to count

Comments

• Comments serve as a way for programmers todocument their programs,

• Comments can be specified:– on a single line, beginning with a semicolon until the

end of the line:stc ; set the Carry Flag

– in a block beginning with the directive COMMENTand a user-specified symbol wchih also ends thecomment:COMMENT !… …!

Example: Adding Three Numbers

TITLE Add And Subtract (AddSub.asm); This program adds and subtracts 32-bit

integers.INCLUDE Irvine32.inc.codemain PROC

mov eax, 10000h ;Copies 10000h into EAXadd eax, 40000h ;Adds 40000h to EAXsub eax, 20000h ; Subtracts 20000h from EAXcall DumpRegs ; Call the procedure DumpRegsexit ; Call Windows procedure Exit

; to halt the programmain ENDP ; marks the end of main

end main ; last line to be assembled

marks the program’s titleTreated like a commentCopies the file’s

contents into the program

Program output

EAX=00030000 EBX=00530000 ECX=0063FF68 EDX=BFFC94C0 ESI=817715DC EDI=00000000 EBP=0063FF78 ESP=0063FE3C EIP=00401024 EFL=00000206 CF=0 SF=0 ZF=0 OF=0

An Alternative AddSubTITLE Add And Subtract (AddSubAlt.asm); This program adds and subtracts 32-bit

integers.

.386 ; Minimum CPU to run this is an Intel 386

.MODEL flat, stdcall ; Protected mode program ; using call Windows calls.STACK 4096 ; The stack is 4096 bytes in sizeExitProcess PROTO, dwExitCode:DWORDDumpRegs PROTO ; ExitProcess is an MS-Windows

; procedure ; DumpRegs is a procedure in

; Irvine32.inc ; dwExitCode is a 32-bit value

.codemain PROC

mov eax, 10000hadd eax, 40000hsub eax, 20000hcall DumpRegs

INVOKE ExitProcess, 0 ; INVOKE is a directive ; that calls procedures.

; Call the ExitProcess ; procedure

; Pass back a return ; code of zero.

main ENDPend main

A Program Template

TITLE Program Template (Template.asm); Program Description:; Author:; Creation Date:; Revisions:; Date: Modified by:INCLUDE Irvine32.inc.data

; (insert variables here).codemain PROC

; (insert executable instructions here)exit

main ENDP; (insert additional procedures here)

END main

Assembling, Linking andRunning Programs

Sourcefile

LinkLibrary

Object File

ListingFile

ExecutableProgram

Mapfile

Output

DOS LoaderLinker

Assem-bler

Assembling and Linking the Program

• A 32-bit assembly language program can beassembled and linked in one step by typing:make32 filename

• A 16-bit assembly language program can beassembled and linked in one step by typing:make16 filename

• Example:make32 addsub

Other Files

• In addition to the .asm file (assembler sourcecode), .obj file (object code) and .exe file(executable file), there are other files created bythe assembler and linker:

• .LST (listing) file – contains the sourcecode and object code of the program– .MAP file – contains information about the

segments being linked– .PDB (Program database) file – contains

supplemental information about the program

Intrinsic Data Types

32-bit signed integerSDWORD

32-bit unsigned integer; also Near pointer inProtected Mode

DWORD

16-bit signed integerSWORD

16-bit unsigned integer; also Near Pointer inReal Mode

WORD

8-bit signed integerSBYTE

8-bit unsigned integerBYTE

UsageType

Intrinsic Data Types (continued)

80-bit (10-byte) IEEE extended realREAL10

64-bit (8-byte) IEEE long realREAL8

32-bit (4-byte) IEEE short realREAL4

80-bit (ten-byte) integerTBYTE

64-bit integerQWORD

48-bit integer ; Far Pointer in Protected modeFWORDUsageType

Defining Data

• A data definition statement allocates storage inmemory for variables.

• We write:[name] directive initializer [, initializer]

• There must be at least one initializer.• If there is no specific intial value, we use the

expression ?, which indicate no special value.• All initializer are converted to binary data by the

assembler.

Defining 8-bit Data

• BYTE and SBYTE are used to allcoate storage foran unsigned or signed 8-bit value:value1 BYTE ‘A’ ; character constantvalue2 BYTE 0 ; smallest unsigned bytevalue3 BYTE 255 ; largest unsigned bytevalue4 SBYTE -128 ; smallest signed bytevalue5 SBYTE +127 ; largest signed bytevalue6 BYTE ? ; no initial value.datavalue7 BYTE 10h ; offset is zerovalue8 BYTE 20h ; offset is 1

db Directive

• db is the older directive for allocatingstorage for 8-bit data.

• It does not distinguish between signed andunsigned data:val1 db 255 ; unsigned byteval2 db -128; signed byte

Multiple Initializers

• If a definition has multiple initializers, thelabel is the offset for the first data item:.datalist BYTE 10, 20, 30, 40

10Value:

Offset 0000 0001 0002 0003

20 30 40

Multiple Initializers (continued)

• Not all definitions need labels:.datalist BYTE 10, 20, 30, 40

BYTE 50, 60, 70, 80BYTE 81, 82, 83, 84

10Value:

Offset 0000 0001 0002 0003

20 30 40 50 60

0004 0005

Multiple Initializers (continued)

• The different initializers can use different radixes:.datalist1 BYTE 10, 32, 41h, 00100010blist2 BYTE 0aH, 20H, ‘A’, 22h

• list1 and list2 will have the identical contents, albeitat different offsets.

Defining Strings

• To create a string data definition, enclose asequence of characters in quotation marks.

• The most common way to end a string is anull byte (0):greeting1 BYTE “Good afternoon”, 0

is the same asgreeting1 BYTE ‘G’, ‘o’, ‘o’, … 0

Defining Strings (continued)

• Strings can be spread over several lines:greeting2 BYTE “Welcome to the Encryption”

BYTE “ Demo program”BYTE “created by Kip Irvine”,\

0dh, 0aHBYTE “ If you wish to modify this”

“ program, please”BYTE “send me a copy”, 0dh, 0ah

Concatenates two lines

Using dup

• DUP repeats a storage allocation howevermany times is specified:BYTE 20 DUP(0) ; 20 bytes of zeroBYTE 20 DUP(?) ; 20 bytes uninitializedBYTE 2 DUP(“STACK”)

; 20 bytes “STACKSTACK”

Defining 16-bit Data

• The WORD and SWORD directives allocate storage ofone or more 16-bit integers:word1 WORD 65535 ; largest unsigned valueword2 SWORD -32768; smallest signed valueword3 WORD ? ; uninitialized value

• The dw directive can be used to allocatedstorage for either signed or unsignedintegers:val1 dw 65535 ; unsignedval2 dw -32768 ; signed

Arrays of Words

• You can create an array of word values bylisting them or using the DUP operator:myList WORD 1, 2, 3, 4, 5

array WORD 5 DUP(?); 5 values, uninitialized

1Value:

Offset 0000 0002 0004 0006

2 3 4 5

0008


• The DWORD and SDWORD directives allocate storage ofone or more 32-bit integers:val1 DWORD 12345678h ; unsignedval2 SDWORD -21474836648; signedval3 DWORD 20 DUP(?)

; unsigned array

• The dd directive can be used to allocatedstorage for either signed or unsigned integers:val1 dd 12345678h ; unsignedval2 dw -21474836648 ; signed

Arrays of Doublewords

• You can create an array of word values bylisting them or using the DUP operator:myList DWORD 1, 2, 3, 4, 5

1Value:

Offset 0000 0004 0008 000C

2 3 4 5

0010


• The QWORD directive allocate storage of one or more64-bit (8-byte) values:quad1 QWORD 1234567812345678h

• The dq directive can be used to allocatedstorage:quad1 dq 1234567812345678h


• The TBYTE directive allocate storage of one or more80-bit integers, used mainly for binary-codeddecimal numbers:val1 TBYTE 1000000000123456789h

• The dq directive can be used to allocatedstorage:val1 dt 1000000000123456789h

Defining Real Number Data

• There are three different ways to define realvalues:– REAL4 defines a 4-byte single-precision real

value.– REAL8 defines a 8-byte double-precision real

value.– REAL10 defines a 10-byte extended double-

precision real value.• Each requires one or more real constant

initializers.

Examples of Real Data Definitions

rVal1 REAL4 -2.1rVal2 REAL8 3.2E-260rVal3 REAL10 4.6E+4096ShortArray REAL4 20 DUP(?)

rVal1 DD -1.2rVal2 dq 3.2E-260rVal3 dt 4.6E+4096

Ranges For Real Numbers

3.37×10-4932 to1.18×104932

19Extended Real

2.23×10-308 to 1.79×1030815Long Real

1.18×10-38 to 3.40×10386Short Real

Approximate RangeSignificantDigits

Data Type

Little Endian Order

• Consider the number 12345678h:

78

56

34

12

0001:

0000:

0002:

0003:

Little-endian

12

34

56

78

0001:

0000:

0002:

0003:

Big-endian

Adding Variables to AddSub

TITLE Add And Subtract (AddSub2.asm); This program adds and subtracts 32-bitintegers.

; and stores the sum in a variableINCLUDE Irvine32.inc.dataval1 DWORD10000hval2 DWORD40000hval3 DWORD20000hfinalVal DWORD?

.codemain PROCmov eax, val1 ; Start with 10000hadd eax, val2 ; Add 40000hsub eax, val3 ; Subtract 2000hmov finalVal, eax ; Save itcall DumpRegs ; Display the

; registersexit

main ENDPend main

Symbolic Constants

• Equate directives allows constants andliterals to be given symbolic names.

• The directives are:– Equal-Sign Directive– EQU Directive– TEXTEQU Directive

Equal-Sign Directive

• The equal-sign directive creates a symbol byassigning a numeric expression to a name.

• The syntax is:name = expression

• The equal sign directive assigns no storage; it justensures that occurrences of the name are replacesby the expression.

Equal-Sign Directive (continued)

• Expression must be expressable as 32-bit integers (this requires a .386 orhigher directive).

• Examples:prod = 10 * 5 ; Evaluates an expressionmaxInt = 7FFFh ; Maximum 16-bit signed valueminInt = 8000h ; Minimum 16-bit signed valuemaxUInt = 0FFFh ; Maximum 16-bit unsigned valueString = ‘XY ’ ; Up to two characters allowedCount = 500endvalue = count + 1 ;Can use a predefined symbol

.386maxLong = 7FFFFFFFh

; Maximum 32-bit signed valueminLong = 80000000h; Minimum 32-bit signed value

maxULong = 0fffffffh; Maximum 32-bit unsigned value

Equal-Sign Directive (continued)

• A symbol defined with an equal-sign directive can beredefined with a different value later within the sameprogram:– Statement: Assembled as:

count = 5mov al, count mov al, 5mov dl, al mov al, dlcount = 10mov cx, count mov cx, 10mov dx, count mov dx, 10count = 2000mov ax, count mov ax, 2000

EQU Directive• The EQU Directive assigns a symbolic name to a string or

numeric constant• Symbols defined using EQU cannot be redefined.• Expressions are evaluated as integer values, but floating

point values are evaluated as strings.• Strings may be enclosed in the brackets < > to ensure their

correct interpretation.• Examples:

Example Type of valuemaxint equ 32767 Numericmaxuint equ 0FFFFh Numericcount equ 10 * 20 Numericfloat1 equ <2.345> String

TEXTEQU Directive• The TEXTEQU directive assigns a name to a sequence of

characters.• Syntax:

name TEXTEQU <text>name TEXTEQU textmacroname TEXTEQU %constExpr

• Textmacro is a predefined text macro (more about this later)constExpr is a numeric expression which is evaluated andused as a string.

• Example:continueMsg textequ <“Do you wish to continue?”>.dataprompt1 db ContinueMsg

TEXTEQU Examples;Symbol declarations:move textequ <mov>address textequ <offset>; Original code:move bx, address valuemove al, 20

; Assembled as:mov bx, offset valuemov al, 20

TEXTEQU Examples (continued)

.datamyString BYTE “A string”, 0.codep1 textequ <offset MyString>

mov bx, p1; bx = offset myString

p1 textequ <0>mov si, p1 ; si = 0

Real-Address ModeProgramming

TITLE Add And Subtract (AddSub3.asm); This program adds and subtracts 32-bit; integers and stores the sum in a; variable. Target : Real ModeINCLUDE Irvine16.inc.dataval1 DWORD10000hval2 DWORD40000hval3 DWORD20000hfinalVal DWORD?

.codemain PROC

mov ax, @datamov ds, ax ; initialize the data

; segment registermov eax, val1 ; Start with 10000hadd eax, val2 ; Add 40000hsub eax, val3 ; Subtract 2000hmov finalVal, eax ; Save itcall DumpRegs ; Display the

; registersexit

main ENDPend main