Computer Organization andAssembly Language
Lecture 3 – Assembly LanguageFundamentals
Basic Elements of Assembly Language
An assembly language program is composed of :• Constants• Expressions• Literals• Reserved Words• Mnemonics• Identifiers• Directives• Instructions• Comments
Integer Constants
• Integer constants can be written in decimal,hexadecimal, octal or binary, by adding a radix (ornumber base) suffix to the end .
• Radix Suffices:– d decimal (the default)– h hexadecimal– q or o octal– b binary
Examples of Integer Constants
• 26 Decimal• 1Ah Hexadecimal• 1101b Binary• 36q Octal• 2Bh Hexadecimal• 42Q Octal• 36D Decimal• 47d Decimal
Integer Expressions
• An integer expressions is a mathematicalexpressions involving integer values and integeroperators.
• The expressions must be one that can be stored in32 bits (or less).
• The precedence:– ( ) Expressions in Parentheses– +, - Unary Plus and minus– *, /, Mod Multiply, Divide, Modulus– +, - Add, Subtract
Examples of Integer Expressions
(4 + 2) * 6
12 – 1 MOD 5
-5 + 214 + 5 * 2
20-3 + 4 * 6 – 1-35- (3 + 4) * (6 – 1)
316 / 5ValueExpression
Real Number Constants
• There are two types of real number constants:– Decimal reals, which contain a sign followed
by a number with decimal fraction and anexponent:[sign] integer.[integer][exponent]Examples:2. +3.0 -44.2E+05 26.E5
– Encoded reals, which are represented exactlyas they are stored:3F80000r
Characters Constants
• A character constant is a single characterenclosed in single or double quotationmarks.
• The assembler converts it to the equivalentvalue in the binary code ASCII:‘A’“d”
String Constants
• A string constant is a string of charactersenclosed in single or double quotationmarks:‘ABC’“x”“Goodnight, Gracie”‘4096’“This isn’t a test”‘Say “Goodnight, ” Gracie’
Reserved Words
• Reserved words have a special meaning to theassembler and cannot be used for anything otherthan their specified purpose.
• They include:– Instruction mnemonics– Directives– Operators in constant expressions– Predefined symbols such as @data which return
constant values at assembly time.
Identifiers
• Identifiers are names that the programmerchooses to represent variables, constants,procedures or labels.
• Identifiers:– can have 1 to 247 characters– are not case-sensitive– begin with a letter , underscore, @ or $ and can
also contain digits after the first character.– cannot be reserved words
Examples of Identifiers
var1 open_file_main _12345@@myfile $firstCount MAXxVal
Directives
• Directives are commands for the assembler,telling it how to assemble the program.
• Directives have a syntax similar to assemblylanguage but do not correspond to Intel processorinstructions.
• Directives are also case-insensitive:• Examples
.data
.codename PROC
Instructions
• An instruction in Assembly language consists of aname (or label), an instruction mnemonic,operands and a comment
• The general form is:[name] [mnemonic] [operands] [; comment]
• Statements are free-form; i.e, they can be writtenin any column with any number of spaces betweenin each operand as long as they are on one line anddo not pass column 128.
Labels
• Labels are identifiers that serve as place markerswithin the program for either code or data.
• These are replaces in the machine-languageversion of the program with numeric addresses.
• We use them because they are more readable:mov ax, [9020]vs.mov ax, MyVariable
Code Labels
• Code labels mark a particular point withinthe program’s code.
• Code labels appear at the beginning and areimmediately followed by a colon:
target:mov ax, bx… …jmp target
Data Labels
• Labels that appear in the operand field of aninstruction:mov first, ax
• Data labels must first be declared in the datasection of the program:first BYTE 10
Instruction Mnemonics
• Instruction mnemonics are abbreviationsthat identify the operation carried out by theinstruction:mov - move a value to another locationadd - add two valuessub - subtract a value from anotherjmp - jump to a new location in the programmul - multiply two valuescall - call a procedure
Operands
• Operands in an assembly languageinstruction can be:– constants 96
– constant expressions 2 + 4
– registers eax
– memory locations count
Operands and Instructions
• All instructions have a predetermined number ofoperands.
• Some instructions use no operands:stc ; set the Carry Flag
• Some instructions use one operand:inc ax ; add 1 to AX
• Some instructions use two operands:mov count, bx ; add BX to count
Comments
• Comments serve as a way for programmers todocument their programs,
• Comments can be specified:– on a single line, beginning with a semicolon until the
end of the line:stc ; set the Carry Flag
– in a block beginning with the directive COMMENTand a user-specified symbol wchih also ends thecomment:COMMENT !… …!
Example: Adding Three Numbers
TITLE Add And Subtract (AddSub.asm); This program adds and subtracts 32-bit
integers.INCLUDE Irvine32.inc.codemain PROC
mov eax, 10000h ;Copies 10000h into EAXadd eax, 40000h ;Adds 40000h to EAXsub eax, 20000h ; Subtracts 20000h from EAXcall DumpRegs ; Call the procedure DumpRegsexit ; Call Windows procedure Exit
; to halt the programmain ENDP ; marks the end of main
end main ; last line to be assembled
marks the program’s titleTreated like a commentCopies the file’s
contents into the program
Program output
EAX=00030000 EBX=00530000 ECX=0063FF68 EDX=BFFC94C0 ESI=817715DC EDI=00000000 EBP=0063FF78 ESP=0063FE3C EIP=00401024 EFL=00000206 CF=0 SF=0 ZF=0 OF=0
An Alternative AddSubTITLE Add And Subtract (AddSubAlt.asm); This program adds and subtracts 32-bit
integers.
.386 ; Minimum CPU to run this is an Intel 386
.MODEL flat, stdcall ; Protected mode program ; using call Windows calls.STACK 4096 ; The stack is 4096 bytes in sizeExitProcess PROTO, dwExitCode:DWORDDumpRegs PROTO ; ExitProcess is an MS-Windows
; procedure ; DumpRegs is a procedure in
; Irvine32.inc ; dwExitCode is a 32-bit value
.codemain PROC
mov eax, 10000hadd eax, 40000hsub eax, 20000hcall DumpRegs
INVOKE ExitProcess, 0 ; INVOKE is a directive ; that calls procedures.
; Call the ExitProcess ; procedure
; Pass back a return ; code of zero.
main ENDPend main
A Program Template
TITLE Program Template (Template.asm); Program Description:; Author:; Creation Date:; Revisions:; Date: Modified by:INCLUDE Irvine32.inc.data
; (insert variables here).codemain PROC
; (insert executable instructions here)exit
main ENDP; (insert additional procedures here)
END main
Assembling, Linking andRunning Programs
Sourcefile
LinkLibrary
Object File
ListingFile
ExecutableProgram
Mapfile
Output
DOS LoaderLinker
Assem-bler
Assembling and Linking the Program
• A 32-bit assembly language program can beassembled and linked in one step by typing:make32 filename
• A 16-bit assembly language program can beassembled and linked in one step by typing:make16 filename
• Example:make32 addsub
Other Files
• In addition to the .asm file (assembler sourcecode), .obj file (object code) and .exe file(executable file), there are other files created bythe assembler and linker:
• .LST (listing) file – contains the sourcecode and object code of the program– .MAP file – contains information about the
segments being linked– .PDB (Program database) file – contains
supplemental information about the program
Intrinsic Data Types
32-bit signed integerSDWORD
32-bit unsigned integer; also Near pointer inProtected Mode
DWORD
16-bit signed integerSWORD
16-bit unsigned integer; also Near Pointer inReal Mode
WORD
8-bit signed integerSBYTE
8-bit unsigned integerBYTE
UsageType
Intrinsic Data Types (continued)
80-bit (10-byte) IEEE extended realREAL10
64-bit (8-byte) IEEE long realREAL8
32-bit (4-byte) IEEE short realREAL4
80-bit (ten-byte) integerTBYTE
64-bit integerQWORD
48-bit integer ; Far Pointer in Protected modeFWORDUsageType
Defining Data
• A data definition statement allocates storage inmemory for variables.
• We write:[name] directive initializer [, initializer]
• There must be at least one initializer.• If there is no specific intial value, we use the
expression ?, which indicate no special value.• All initializer are converted to binary data by the
assembler.
Defining 8-bit Data
• BYTE and SBYTE are used to allcoate storage foran unsigned or signed 8-bit value:value1 BYTE ‘A’ ; character constantvalue2 BYTE 0 ; smallest unsigned bytevalue3 BYTE 255 ; largest unsigned bytevalue4 SBYTE -128 ; smallest signed bytevalue5 SBYTE +127 ; largest signed bytevalue6 BYTE ? ; no initial value.datavalue7 BYTE 10h ; offset is zerovalue8 BYTE 20h ; offset is 1
db Directive
• db is the older directive for allocatingstorage for 8-bit data.
• It does not distinguish between signed andunsigned data:val1 db 255 ; unsigned byteval2 db -128; signed byte
Multiple Initializers
• If a definition has multiple initializers, thelabel is the offset for the first data item:.datalist BYTE 10, 20, 30, 40
10Value:
Offset 0000 0001 0002 0003
20 30 40
Multiple Initializers (continued)
• Not all definitions need labels:.datalist BYTE 10, 20, 30, 40
BYTE 50, 60, 70, 80BYTE 81, 82, 83, 84
10Value:
Offset 0000 0001 0002 0003
20 30 40 50 60
0004 0005
Multiple Initializers (continued)
• The different initializers can use different radixes:.datalist1 BYTE 10, 32, 41h, 00100010blist2 BYTE 0aH, 20H, ‘A’, 22h
• list1 and list2 will have the identical contents, albeitat different offsets.
Defining Strings
• To create a string data definition, enclose asequence of characters in quotation marks.
• The most common way to end a string is anull byte (0):greeting1 BYTE “Good afternoon”, 0
is the same asgreeting1 BYTE ‘G’, ‘o’, ‘o’, … 0
Defining Strings (continued)
• Strings can be spread over several lines:greeting2 BYTE “Welcome to the Encryption”
BYTE “ Demo program”BYTE “created by Kip Irvine”,\
0dh, 0aHBYTE “ If you wish to modify this”
“ program, please”BYTE “send me a copy”, 0dh, 0ah
Concatenates two lines
Using dup
• DUP repeats a storage allocation howevermany times is specified:BYTE 20 DUP(0) ; 20 bytes of zeroBYTE 20 DUP(?) ; 20 bytes uninitializedBYTE 2 DUP(“STACK”)
; 20 bytes “STACKSTACK”
Defining 16-bit Data
• The WORD and SWORD directives allocate storage ofone or more 16-bit integers:word1 WORD 65535 ; largest unsigned valueword2 SWORD -32768; smallest signed valueword3 WORD ? ; uninitialized value
• The dw directive can be used to allocatedstorage for either signed or unsignedintegers:val1 dw 65535 ; unsignedval2 dw -32768 ; signed
Arrays of Words
• You can create an array of word values bylisting them or using the DUP operator:myList WORD 1, 2, 3, 4, 5
array WORD 5 DUP(?); 5 values, uninitialized
1Value:
Offset 0000 0002 0004 0006
2 3 4 5
0008
Defining 32-bit Data
• The DWORD and SDWORD directives allocate storage ofone or more 32-bit integers:val1 DWORD 12345678h ; unsignedval2 SDWORD -21474836648; signedval3 DWORD 20 DUP(?)
; unsigned array
• The dd directive can be used to allocatedstorage for either signed or unsigned integers:val1 dd 12345678h ; unsignedval2 dw -21474836648 ; signed
Arrays of Doublewords
• You can create an array of word values bylisting them or using the DUP operator:myList DWORD 1, 2, 3, 4, 5
1Value:
Offset 0000 0004 0008 000C
2 3 4 5
0010
Defining 64-bit Data
• The QWORD directive allocate storage of one or more64-bit (8-byte) values:quad1 QWORD 1234567812345678h
• The dq directive can be used to allocatedstorage:quad1 dq 1234567812345678h
Defining 80-bit Data
• The TBYTE directive allocate storage of one or more80-bit integers, used mainly for binary-codeddecimal numbers:val1 TBYTE 1000000000123456789h
• The dq directive can be used to allocatedstorage:val1 dt 1000000000123456789h
Defining Real Number Data
• There are three different ways to define realvalues:– REAL4 defines a 4-byte single-precision real
value.– REAL8 defines a 8-byte double-precision real
value.– REAL10 defines a 10-byte extended double-
precision real value.• Each requires one or more real constant
initializers.
Examples of Real Data Definitions
rVal1 REAL4 -2.1rVal2 REAL8 3.2E-260rVal3 REAL10 4.6E+4096ShortArray REAL4 20 DUP(?)
rVal1 DD -1.2rVal2 dq 3.2E-260rVal3 dt 4.6E+4096
Ranges For Real Numbers
3.37×10-4932 to1.18×104932
19Extended Real
2.23×10-308 to 1.79×1030815Long Real
1.18×10-38 to 3.40×10386Short Real
Approximate RangeSignificantDigits
Data Type
Little Endian Order
• Consider the number 12345678h:
78
56
34
12
0001:
0000:
0002:
0003:
Little-endian
12
34
56
78
0001:
0000:
0002:
0003:
Big-endian
Adding Variables to AddSub
TITLE Add And Subtract (AddSub2.asm); This program adds and subtracts 32-bitintegers.
; and stores the sum in a variableINCLUDE Irvine32.inc.dataval1 DWORD10000hval2 DWORD40000hval3 DWORD20000hfinalVal DWORD?
.codemain PROCmov eax, val1 ; Start with 10000hadd eax, val2 ; Add 40000hsub eax, val3 ; Subtract 2000hmov finalVal, eax ; Save itcall DumpRegs ; Display the
; registersexit
main ENDPend main
Symbolic Constants
• Equate directives allows constants andliterals to be given symbolic names.
• The directives are:– Equal-Sign Directive– EQU Directive– TEXTEQU Directive
Equal-Sign Directive
• The equal-sign directive creates a symbol byassigning a numeric expression to a name.
• The syntax is:name = expression
• The equal sign directive assigns no storage; it justensures that occurrences of the name are replacesby the expression.
Equal-Sign Directive (continued)
• Expression must be expressable as 32-bit integers (this requires a .386 orhigher directive).
• Examples:prod = 10 * 5 ; Evaluates an expressionmaxInt = 7FFFh ; Maximum 16-bit signed valueminInt = 8000h ; Minimum 16-bit signed valuemaxUInt = 0FFFh ; Maximum 16-bit unsigned valueString = ‘XY ’ ; Up to two characters allowedCount = 500endvalue = count + 1 ;Can use a predefined symbol
.386maxLong = 7FFFFFFFh
; Maximum 32-bit signed valueminLong = 80000000h; Minimum 32-bit signed value
maxULong = 0fffffffh; Maximum 32-bit unsigned value
Equal-Sign Directive (continued)
• A symbol defined with an equal-sign directive can beredefined with a different value later within the sameprogram:– Statement: Assembled as:
count = 5mov al, count mov al, 5mov dl, al mov al, dlcount = 10mov cx, count mov cx, 10mov dx, count mov dx, 10count = 2000mov ax, count mov ax, 2000
EQU Directive• The EQU Directive assigns a symbolic name to a string or
numeric constant• Symbols defined using EQU cannot be redefined.• Expressions are evaluated as integer values, but floating
point values are evaluated as strings.• Strings may be enclosed in the brackets < > to ensure their
correct interpretation.• Examples:
Example Type of valuemaxint equ 32767 Numericmaxuint equ 0FFFFh Numericcount equ 10 * 20 Numericfloat1 equ <2.345> String
TEXTEQU Directive• The TEXTEQU directive assigns a name to a sequence of
characters.• Syntax:
name TEXTEQU <text>name TEXTEQU textmacroname TEXTEQU %constExpr
• Textmacro is a predefined text macro (more about this later)constExpr is a numeric expression which is evaluated andused as a string.
• Example:continueMsg textequ <“Do you wish to continue?”>.dataprompt1 db ContinueMsg
TEXTEQU Examples;Symbol declarations:move textequ <mov>address textequ <offset>; Original code:move bx, address valuemove al, 20
; Assembled as:mov bx, offset valuemov al, 20
TEXTEQU Examples (continued)
.datamyString BYTE “A string”, 0.codep1 textequ <offset MyString>
mov bx, p1; bx = offset myString
p1 textequ <0>mov si, p1 ; si = 0
Real-Address ModeProgramming
TITLE Add And Subtract (AddSub3.asm); This program adds and subtracts 32-bit; integers and stores the sum in a; variable. Target : Real ModeINCLUDE Irvine16.inc.dataval1 DWORD10000hval2 DWORD40000hval3 DWORD20000hfinalVal DWORD?
.codemain PROC
mov ax, @datamov ds, ax ; initialize the data
; segment registermov eax, val1 ; Start with 10000hadd eax, val2 ; Add 40000hsub eax, val3 ; Subtract 2000hmov finalVal, eax ; Save itcall DumpRegs ; Display the
; registersexit
main ENDPend main