x86, Assembler TASM, MASM, NASM

5 Assembler

Page 1: 5 Assembler

x86, Assembler

TASM, MASM, NASM

Page 2: 5 Assembler

Available assemblers

MASM: Microsoft Macro Assembler

TASM: Borland Turbo Assembler

NASM: Netwide Assembler, free under the Library General Public License (LGPL)

Others: Flat Assembler, SpAssembler, etc.

Page 3: 5 Assembler

MASM: Microsoft Macro Assembler

MASM contains a macro language with looping, arithmetic, text string processing, and so on. It supports the instruction sets of the 386, 486, and Pentium processors, giving you greater direct control over the hardware. You can also avoid extra time and memory overhead when using MASM.

http://msdn.microsoft.com/library/en-us/vcmasm/html/vcoriMicrosoftAssemblerMacroLanguage.asp

Page 4: 5 Assembler

TASM: Turbo Assembler

TASM, Inprise's Borland Turbo Assembler, supports an alternative to MASM emulation. This is known as Ideal mode and provides several advantages over MASM.

The key (questionable) disadvantage, of course, is that MASM-style assemblers cannot assemble Ideal mode programs.

Page 5: 5 Assembler

NASM: Netwide Assembler

NASM is designed for portability and modularity. It supports a range of object file formats including Linux, Microsoft 16-bit OBJ and Win32. Its syntax is designed to be simple and easy to understand, similar to Intel's but less complex.

It supports Pentium, P6, MMX, 3DNow! and SSE opcodes, and has macro capability. It includes a disassembler as well.

NASM is free software, released under the Library General Public License (LGPL).

http://nasm.sourceforge.net

Page 6: 5 Assembler

FASM: Flat Assembler

Currently it supports all 8086-80486/Pentium instructions with MMX, SSE, SSE2, SSE3 and 3DNow! extensions, and can produce output in binary, MZ, PE, COFF or ELF format.

It includes powerful but easy-to-use macroinstruction support and does multiple passes to optimize the instruction codes for size. The flat assembler is self-compilable and the full source code is included.

http://flatassembler.net/

Page 7: 5 Assembler

CPU's language (instructions): x86 instruction set

About compiler directives: MASM, TASM, NASM

About developing assembly language

Page 8: 5 Assembler

TASM

Page 9: 5 Assembler

Important files

Compiler
TASM: 16-bit real mode
TASM32: 32-bit protected mode

Linker
TLINK

Page 10: 5 Assembler

Pseudo instructions

Segment, ends: To define a segment.

Assume: To specify which segment register is used for each segment defined by "Segment, ends".

Data allocation

Page 11: 5 Assembler

Segment Declaration

Usage:
    Segment_name segment
        …
    Segment_name ends

Ex.
    Cseg segment
    Cseg ends

Page 12: 5 Assembler

Label declaration

Usage: a label name followed by a colon (:)

Ex.
    Start:
        …
        mov bx, offset start
        jmp Start

Page 13: 5 Assembler

Data allocation

Define value
DB: Define Byte
DW: Define Word
DD: Define Doubleword
DQ: Define Quadword
DT: Define Ten Bytes

Usage: Var_name Dx data

Page 14: 5 Assembler

Ex. Data allocation

    dseg segment
    Msg  db "hello world$"
    MulH dw 0, 1, 2, 3
    MulF dd 1234h
    dseg ends

Page 15: 5 Assembler

Data duplication

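Usage: type count dup (value)

Ex.
    data1 db 10 dup (0)              ; 10 bytes of 0
    data2 db 2 dup (3 dup (0))       ; 2 x 3 = 6 bytes of 0
    data3 db 3 dup (1, 2, 3 dup (4)) ; 1,2,4,4,4 repeated 3 times = 15 bytes
    data4 db 4 dup (?)               ; 4 uninitialized bytes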
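To check how nested DUP expressions expand, the rule can be modelled outside the assembler. This is a minimal Python sketch, not TASM code: the helper name `dup` is hypothetical, and `None` stands in for the uninitialized `?`.

```python
def dup(count, *values):
    """Model `count dup (values...)`: repeat the value list `count` times."""
    out = []
    for _ in range(count):
        for v in values:
            # A nested dup() call returns a list; splice it in flat.
            out.extend(v if isinstance(v, list) else [v])
    return out

data1 = dup(10, 0)               # db 10 dup (0)
data2 = dup(2, dup(3, 0))        # db 2 dup (3 dup (0))
data3 = dup(3, 1, 2, dup(3, 4))  # db 3 dup (1, 2, 3 dup (4))
data4 = dup(4, None)             # db 4 dup (?), None = uninitialized

print(len(data1), len(data2), len(data3), len(data4))  # 10 6 15 4
print(data3[:5])                                       # [1, 2, 4, 4, 4]
```

The inner DUP is expanded first, then the whole value list is repeated by the outer count.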

Page 16: 5 Assembler

Structure

    Struc PosType
        Row dw ?
        Col dw ?
    Ends PosType

    Union PosValType
        Pos PosType ?
        Val dd ?
    Ends PosValType

    Point PosValType ?

Page 17: 5 Assembler

Structure

    mov [Point.Pos.Row], bx ; OK: Move BX to Row component of Point

    mov [Point.Pos.Row], bl ; Error: mismatched operands
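Because PosValType is a union, Pos (Row, Col) and Val occupy the same four bytes. A rough Python model of that overlap follows, assuming the x86 little-endian layout; the values Row=1 and Col=2 are illustrative and do not come from the slides.

```python
import struct

# Pos view: two 16-bit words, Row first, then Col (little-endian, as on x86).
raw = struct.pack("<HH", 1, 2)      # Row = 1, Col = 2 (illustrative values)

# Val view: the same 4 bytes read back as one 32-bit doubleword.
(val,) = struct.unpack("<I", raw)

row_size = struct.calcsize("<H")    # Row is a word: 2 bytes

print(raw.hex(" "))   # 01 00 02 00
print(hex(val))       # 0x20001
print(row_size)       # 2
```

Row is two bytes wide, which is why the 16-bit BX is accepted as a source and the 8-bit BL is rejected.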

Page 18: 5 Assembler

Data reference

offset directive: to retrieve the offset of a data item
    mov bx, offset msg1 ; bx = offset/address of msg1

To retrieve / store data
    mov dx, msg1        ; dx = [msg1]
    mov [msg1], dx      ; [msg1] = dx
    mov [bx+2], dx      ; [bx+2] = dx

Page 19: 5 Assembler

Memory contents

    ByteVal db ?           ; "ByteVal" is name of byte variable
    mov ax, bx             ; OK: Move value of BX to AX
    mov ax, [bx]           ; OK: Move word at address BX to AX. Size of
                           ; destination is used to generate proper object code
    mov ax, [word bx]      ; OK: Same as above with unnecessary size qualifier
    mov ax, [word ptr bx]  ; OK: Same as above with unnecessary size qualifier
                           ; and redundant pointer prefix
    mov al, [bx]           ; OK: Move byte at address BX to AL. Size of
                           ; destination is used to generate proper object code
    mov [bx], al           ; OK: Move AL to location BX

Page 20: 5 Assembler

Memory contents

    mov ByteVal, al          ; Warning: "ByteVal" needs brackets
    mov [ByteVal], al        ; OK: Move AL to memory location named "ByteVal"
    mov [ByteVal], ax        ; Error: unmatched operands
    mov al, [bx+2]           ; OK: Move byte from memory location BX+2 to AL
    mov al, bx[2]            ; Error: indexes must occur with "+" as above
    mov bx, Offset ByteVal   ; OK: Offset statement does not use brackets
    mov bx, Offset [ByteVal] ; Error: offset cannot be taken of the contents
                             ; of memory

Page 21: 5 Assembler

Memory contents

    lea bx, [ByteVal]    ; OK: Load effective address of "ByteVal"
    lea bx, ByteVal      ; Error: brackets required
    mov ax, 01234h       ; OK: Move constant word to AX
    mov [bx], 012h       ; Warning: size qualifier needed to determine
                         ; whether to populate byte or word
    mov [byte bx], 012h  ; OK: constant 012h is moved to byte at address BX
    mov [word bx], 012h  ; OK: constant 012h is moved to word at address BX

Page 22: 5 Assembler

Echo entered string

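    cseg segment
    assume cs:cseg, ds:cseg
            org 100h
    start:  jmp load
    buf     db 11, 12 dup (' ')   ; max length, then count byte + text
    _ent    db 10,13,'$'          ; lf,cr

    load:   mov ah,0ah            ; read a line into buf
            mov dx,offset buf
            int 21h

            mov ah,09h            ; print a new line
            mov dx,offset _ent
            int 21h

            mov al,[buf+1]        ; al = number of characters read
            mov ah,00h
            mov bx,offset buf+2
            add bx,ax
            mov byte ptr [bx],'$' ; terminate the text with '$'

            mov ah,09h            ; print the entered text
            mov dx,offset buf+2
            int 21h

            int 20h               ; exit to DOS
    cseg ends
    end start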
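The program relies on the buffer layout of DOS function 0Ah: byte 0 holds the capacity, byte 1 receives the number of characters read, and the text starts at byte 2. The Python sketch below models only that layout so the pointer arithmetic can be followed; the function names are hypothetical and the model ignores keyboard handling.

```python
def dos_buffered_input(buf: bytearray, typed: bytes) -> None:
    """Rough model of INT 21h, AH=0Ah: buf[0] is the capacity (CR included)."""
    capacity = buf[0]
    kept = typed[: capacity - 1]     # one slot is reserved for the CR
    buf[1] = len(kept)               # count returned by DOS, CR excluded
    buf[2 : 2 + len(kept)] = kept
    buf[2 + len(kept)] = 0x0D        # terminating carriage return

def echo(typed: bytes) -> bytes:
    buf = bytearray([11]) + bytearray(b" " * 12)  # db 11, 12 dup (' ')
    dos_buffered_input(buf, typed)
    count = buf[1]                   # mov al,[buf+1]
    buf[2 + count] = ord("$")        # mov byte ptr [bx],'$'
    end = buf.index(b"$", 2)         # AH=09h prints up to the first '$'
    return bytes(buf[2:end])

print(echo(b"hello"))  # b'hello'
```

Replacing the carriage return with '$' is what lets function 09h print exactly the typed characters.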

Page 23: 5 Assembler

Compiling a program

Syntax: TASM [options] source [,object] [,listing] [,xref]

/z Display source line with error message
/zi, /zd, /zn Debug info: zi=full, zd=line numbers only, zn=none

Ex. TASM /zi hello.asm

Page 24: 5 Assembler

Creating an executable file

TLINK objfiles, exefile, mapfile, libfiles, deffile, resfiles

/v Full symbolic debug information
/t Create COM file (same as /Tdc)
/Txx Specify output file type
    /Tdx DOS image (default); x can be e=EXE or c=COM
    /Twx Windows image; x can be e=EXE or d=DLL

Ex. TLINK /v /t hello;

Page 25: 5 Assembler

NASM

Page 26: 5 Assembler

NASM vs. MASM & TASM

NASM is case sensitive.

NASM requires square brackets for memory references.

No 'offset' is needed: a bare name gives its value, whether it is an 'equ' or an address.
    mov ax, data   ; MASM/TASM: mov ax, offset data

Use square brackets to retrieve the content.
    mov ax, [data]

Everything is treated as a label, rather than as a variable, an equ, or something else.

Page 27: 5 Assembler

NASM vs. MASM & TASM

Does not support hybrid syntaxes, such as
    mov ax, table[bx] -> mov ax, [table+bx]

Likewise
    mov ax, es:[di] -> mov ax, [es:di]

Page 28: 5 Assembler

NASM Doesn't Store Variable Types

NASM, by design, chooses not to remember the types of variables you declare. Whereas MASM will remember, on seeing `var dw 0', that you declared `var' as a word-size variable, and will then be able to fill in the ambiguity in the size of the instruction `mov var,2', NASM will deliberately remember nothing about the symbol `var' except where it begins, and so you must explicitly code `mov word [var],2'.

Page 29: 5 Assembler

NASM Doesn't Store Variable Types

For this reason, NASM doesn't support the `LODS', `MOVS', `STOS', `SCAS', `CMPS', `INS', or `OUTS' instructions, but only supports the forms such as `LODSB', `MOVSW', and `SCASD', which explicitly specify the size of the components of the strings being manipulated.

Page 30: 5 Assembler

NASM Doesn't `ASSUME'

As part of NASM's drive for simplicity, it also does not support the ‘ASSUME’ directive.

NASM will not keep track of what values you choose to put in your segment registers, and will never _automatically_ generate a segment override prefix.

Page 31: 5 Assembler

NASM Doesn't Support Memory Models

NASM also does not have any directives to support different 16-bit memory models. The programmer has to keep track of which functions are supposed to be called with a far call and which with a near call, and is responsible for putting the correct form of ‘RET’ instruction (`RETN' or `RETF'; NASM accepts `RET' itself as an alternate form for `RETN'); in addition, the programmer is responsible for coding CALL FAR instructions where necessary when calling _external_ functions, and must also keep track of which external variable definitions are far and which are near.

Page 32: 5 Assembler

Layout of a NASM Source Line

Like most assemblers, each NASM source line contains (unless it is a macro, a preprocessor directive or an assembler directive) some combination of the four fields

label: instruction operands ; comment

Page 33: 5 Assembler

Declaring Initialized Data

DB, DW, DD, DQ and DT are used, much as in MASM, to declare initialized data in the output file. They can be invoked in a wide range of ways:

    db 0x55               ; just the byte 0x55
    db 0x55,0x56,0x57     ; three bytes in succession
    db 'a',0x55           ; character constants are OK
    db 'hello',13,10,'$'  ; so are string constants
    dw 0x1234             ; 0x34 0x12
    dw 'a'                ; 0x61 0x00 (it's just a number)
    dw 'ab'               ; 0x61 0x62 (character constant)
    dw 'abc'              ; 0x61 0x62 0x63 0x00 (string)
    dd 0x12345678         ; 0x78 0x56 0x34 0x12
    dd 1.234567e20        ; floating-point constant
    dq 1.234567e20        ; double-precision float
    dt 1.234567e20        ; extended-precision float
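The byte orders in the comments follow from x86 being little-endian. They can be reproduced with Python's `struct` module; this is a sketch for checking the layout rather than anything NASM runs, and the helper names `dw`, `dd` and `dw_string` are hypothetical.

```python
import struct

def dw(value: int) -> bytes:
    """dw with a numeric operand: one 16-bit little-endian word."""
    return struct.pack("<H", value)

def dd(value: int) -> bytes:
    """dd with a numeric operand: one 32-bit little-endian doubleword."""
    return struct.pack("<I", value)

def dw_string(text: str) -> bytes:
    """dw with a string: the characters, zero-padded to a whole word."""
    raw = text.encode("ascii")
    return raw + b"\x00" * (-len(raw) % 2)

print(dw(0x1234).hex(" "))        # 34 12
print(dw(ord("a")).hex(" "))      # 61 00
print(dw_string("ab").hex(" "))   # 61 62
print(dw_string("abc").hex(" "))  # 61 62 63 00
print(dd(0x12345678).hex(" "))    # 78 56 34 12
```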

Page 34: 5 Assembler

Declaring Uninitialized Data

RESB, RESW, RESD, RESQ and REST are designed to be used in the BSS section of a module: they declare uninitialized storage space.

Each takes a single operand, which is the number of bytes, words, doublewords or whatever to reserve.

NASM does not support the MASM/TASM syntax of reserving uninitialized space by writing `DW ?' or similar things.

Page 35: 5 Assembler

Defining Constants

EQU defines a symbol to a given constant value: when EQU is used, the source line must contain a label. The action of EQU is to define the given label name to the value of its (only) operand.

This definition is absolute, and cannot change later. So, for example,

    message db 'hello, world'
    msglen  equ $-message

defines msglen to be the constant 12.

Page 36: 5 Assembler

Repeating Instructions or Data

The TIMES prefix causes the instruction to be assembled multiple times. This is partly present as NASM's equivalent of the DUP syntax supported by MASM-compatible assemblers, in that you can code

    zerobuf: times 64 db 0
             times 100 movsb ; trivial unrolled loops

Page 37: 5 Assembler

Effective Addresses

An effective address is any operand to an instruction which references memory. Effective addresses, in NASM, have a very simple syntax: they consist of an expression evaluating to the desired address, enclosed in square brackets. For example:

    wordvar dw 123
            mov ax,[wordvar]
            mov ax,[wordvar+1]
            mov ax,[es:wordvar+bx]
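The expression inside the brackets is a byte address, so `[wordvar+1]` reads a word starting one byte into `wordvar`, not the next word. The Python sketch below models this with a tiny memory image; the byte 0x55 after `wordvar` and the names `memory`, `WORDVAR` and `load_word` are assumptions made for illustration.

```python
import struct

# Hypothetical memory image: wordvar dw 123, followed by one assumed byte 0x55.
memory = struct.pack("<H", 123) + b"\x55"
WORDVAR = 0   # offset of wordvar within this model

def load_word(addr: int) -> int:
    """Read a 16-bit little-endian word, like mov ax,[addr]."""
    return struct.unpack_from("<H", memory, addr)[0]

ax0 = load_word(WORDVAR)       # mov ax,[wordvar]
ax1 = load_word(WORDVAR + 1)   # mov ax,[wordvar+1]

print(ax0)        # 123
print(hex(ax1))   # 0x5500
```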

Page 38: 5 Assembler

Numeric Constants

A numeric constant is simply a number. NASM allows you to specify numbers in a variety of number bases, in a variety of ways: you can suffix H, Q or O, and B for hex, octal and binary, or prefix '0x' or '$' for hex in the style of C and Pascal respectively.

Note that a hex number prefixed with a '$' sign must have a digit after the '$' rather than a letter.

Page 39: 5 Assembler

Ex. Numeric Constants

    mov ax,100        ; decimal
    mov ax,0a2h       ; hex
    mov ax,$0a2       ; hex again: the 0 is required
    mov ax,0xa2       ; hex yet again
    mov ax,777q       ; octal
    mov ax,777o       ; octal again
    mov ax,10010011b  ; binary
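The notations above can be cross-checked with a small parser. This Python sketch covers only the forms shown on this slide and is not NASM's own scanner; the function name `parse_nasm_number` is hypothetical.

```python
def parse_nasm_number(text: str) -> int:
    """Parse the numeric notations listed on the slide into an int."""
    s = text.strip().lower()
    if s.startswith("0x"):                 # C-style hex prefix
        return int(s[2:], 16)
    if s.startswith("$"):                  # Pascal-style hex prefix
        if not s[1:2].isdigit():
            raise ValueError("a digit must follow '$': " + text)
        return int(s[1:], 16)
    if not s[:1].isdigit():                # otherwise it would be a symbol
        raise ValueError("a number must start with a digit: " + text)
    if s.endswith("h"):                    # hex suffix, checked before 'b'
        return int(s[:-1], 16)
    if s.endswith(("q", "o")):             # octal suffixes
        return int(s[:-1], 8)
    if s.endswith("b"):                    # binary suffix
        return int(s[:-1], 2)
    return int(s, 10)                      # plain decimal

for literal in ("100", "0a2h", "$0a2", "0xa2", "777q", "777o", "10010011b"):
    print(literal, "=", parse_nasm_number(literal))
```

The 'h' suffix is tested before 'b' so that a hex value such as 0bh is not mistaken for binary.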

Page 40: 5 Assembler

Echo entered string

            org 0x100
    start:  jmp load
    buf:    db 11
            resb 12            ;reserve 12 bytes
    _ent:   db 10, 13, '$'

    load:   mov ah,0ah
            mov dx,buf
            int 21h

            mov ah,$09
            mov dx,_ent
            int 21h

            mov al,[buf+1]
            mov ah,0x00
            mov bx,buf+2
            add bx,ax
            mov byte [bx],'$'

            mov ah,09h
            mov dx,buf+2
            int 21h
            int 20h

Page 41: 5 Assembler

How to NASM…

    nasm -f bin program.asm -o program.com
    nasm -f bin driver.asm -odriver.sys

Page 42: 5 Assembler

Q & A

That’s it for now.