Upload
others
View
6
Download
0
Embed Size (px)
Citation preview
Chapter 2: AssemblersChapter 2: Assemblers
System System SoftwareSoftware
Department of Computer ScienceDepartment of Computer Science
National National ChiaoChiao--Tung UniversityTung University
ObjectivesObjectives
•• In this section, you willIn this section, you will
–– Understand design issues for assemblerUnderstand design issues for assembler
•• Hardware dependentHardware dependent
•• Hardware independentHardware independent
–– Understand data structures and algorithms Understand data structures and algorithms –– Understand data structures and algorithms Understand data structures and algorithms
commonly adopted in assemblerscommonly adopted in assemblers
•• OPTABOPTAB
•• SYMTABSYMTAB
•• OnOn--pass and multipass and multi--pass assemblingpass assembling
–– Understand the format of an object fileUnderstand the format of an object file
–– Understand the concept of program relocationUnderstand the concept of program relocation
AssemblersAssemblers
•• Translate assembly language into Translate assembly language into machine codesmachine codes
–– Machine independent featuresMachine independent features
•• OPcodeOPcode lookuplookup
•• Symbol address resolvingSymbol address resolving•• Symbol address resolvingSymbol address resolving
•• Program linkingProgram linking
–– Machine dependent featuresMachine dependent features
•• Instruction format encodingInstruction format encoding
•• Addressing mode assignmentAddressing mode assignment
•• Design issuesDesign issues
–– OneOne--pass or multipass or multi--passpass
SIC AssemblySIC Assembly
•• Index addressing is indicated by adding the Index addressing is indicated by adding the modifier “,X”modifier “,X”–– STCH BUFFER,XSTCH BUFFER,X
•• Lines beginning with . are commentsLines beginning with . are comments
•• Assembler directivesAssembler directives–– No executable machine codes will be assembledNo executable machine codes will be assembled–– No executable machine codes will be assembledNo executable machine codes will be assembled
–– START: specify the starting address of a programSTART: specify the starting address of a program
–– END: specify the end of a program with the address END: specify the end of a program with the address of the first instruction to executeof the first instruction to execute
–– BYTE: generate a oneBYTE: generate a one--byte constantbyte constant
–– WORD: generate a oneWORD: generate a one--word constantword constant
–– RESB: reserve a number of bytesRESB: reserve a number of bytes
–– RESW: reserve a number of wordsRESW: reserve a number of words
SIC AssemblySIC Assembly
•• Figure 2.1:Figure 2.1:–– Read records from a device 0xf1 and copy them to Read records from a device 0xf1 and copy them to
the other device 0x05the other device 0x05
–– Record structureRecord structure•• Each records read from 0xf1 is terminated by 0x00Each records read from 0xf1 is terminated by 0x00
•• If a record begins with 0, then there are no more data to readIf a record begins with 0, then there are no more data to read
–– OperationOperation–– OperationOperation•• A record read from 0xf1 is stored in a bufferA record read from 0xf1 is stored in a buffer
–– BYTE BUFFER[4096]BYTE BUFFER[4096]
–– LENGTH to indicate data lengthLENGTH to indicate data length
•• The record is then written to 0x05The record is then written to 0x05
•• Finally, ‘E’’O’’F’ are written to 0x05Finally, ‘E’’O’’F’ are written to 0x05
–– RemarkRemark•• The program first remembers L register filled by OS, and The program first remembers L register filled by OS, and
then restores it as the return address of the programthen restores it as the return address of the program
SIC AssemblySIC Assembly
•• HighlightsHighlights–– LDL and STLLDL and STL
•• Nested function call must be specially handled in SIC Nested function call must be specially handled in SIC because there is no stack pointer because there is no stack pointer
–– JSUB and RSUB for calling and exiting subroutines, JSUB and RSUB for calling and exiting subroutines, respectivelyrespectively
–– Labels for jumpLabels for jump–– Labels for jumpLabels for jump
–– Directives to reserve memory spaceDirectives to reserve memory space
–– Directives for program start, program end, the place Directives for program start, program end, the place where the execution beginswhere the execution begins
•• START and ENDSTART and END•• The first line of a program is not necessarily the first The first line of a program is not necessarily the first
instruction to be executedinstruction to be executed
•• In assembly, it is very hard to know which In assembly, it is very hard to know which registers serve as “parameter” to subroutinesregisters serve as “parameter” to subroutines–– In practice “push all” and “pop all” are often usedIn practice “push all” and “pop all” are often used
A Simple SIC AssemblerA Simple SIC Assembler
•• The translating of source programs The translating of source programs (assembly) to object code (binary) requires:(assembly) to object code (binary) requires:
–– Convert mnemonics to binary codeConvert mnemonics to binary code
•• STL STL �� 0x140x14
–– Resolving symbols into target addressesResolving symbols into target addresses–– Resolving symbols into target addressesResolving symbols into target addresses
•• RETADDR RETADDR �� 0x10330x1033
–– Evaluating constants Evaluating constants
–– Assemble instructions with correct formatsAssemble instructions with correct formats
A Simple SIC AssemblerA Simple SIC Assembler
•• Resolving target addresses of operandsResolving target addresses of operands
–– Most assemblers read and process source Most assemblers read and process source
programs line by lineprograms line by line
–– Some operands could be used before they Some operands could be used before they
are actually definedare actually definedare actually definedare actually defined
•• A.K.A. Forward referenceA.K.A. Forward reference
FIRST STL RETADRFIRST STL RETADR
• Many assemblers take scan the source program
so as to deal with forward reference
– We shall show how to do it later…
A Simple SIC AssemblerA Simple SIC Assembler
•• Generate object codeGenerate object code
–– Object files contain machine interpretable Object files contain machine interpretable
binary databinary data
•• Binary data could be instructions or dataBinary data could be instructions or data
–– Binary data are loaded into memory for the Binary data are loaded into memory for the –– Binary data are loaded into memory for the Binary data are loaded into memory for the
processor to executeprocessor to execute
•• How data are loaded is controlled by special How data are loaded is controlled by special
records in the object filerecords in the object file
–– Where and how longWhere and how long
•• Header records, text records, and end recordsHeader records, text records, and end records
A Simple SIC AssemblerA Simple SIC Assembler
•• Record formatsRecord formats
Header RecordHeader Record
Col. 1Col. 1 HH
Col. 2Col. 2--77 Program nameProgram name
Col. 8Col. 8--1313 Starting address to load this object program (hex)Starting address to load this object program (hex)
Col. 14Col. 14--1919 Length of this object program in bytes (hex)Length of this object program in bytes (hex)Col. 14Col. 14--1919 Length of this object program in bytes (hex)Length of this object program in bytes (hex)
Text RecordText Record
Col. 1Col. 1 TT
Col. 2Col. 2--77 Starting address to load this recordStarting address to load this record
Col. 8Col. 8--99 Length of this record in bytesLength of this record in bytes
Col. 10Col. 10--6969 Object codesObject codes
End RecordEnd Record
Col. 1Col. 1 EE
Col. 2Col. 2--77 Address of the first instruction to executeAddress of the first instruction to execute
A Simple SIC AssemblerA Simple SIC Assembler
•• ^ is for illustration only^ is for illustration only
•• There is no object code to load between There is no object code to load between 0x10330x1033--0x20380x2038–– The loader could reserve the space for variables, as The loader could reserve the space for variables, as
the assembly program indicatesthe assembly program indicates
A Simple SIC AssemblerA Simple SIC Assembler
•• Given formats of input (assembly programs) and Given formats of input (assembly programs) and output (object files), a simple assembler could output (object files), a simple assembler could bebe–– Pass 1: define symbolsPass 1: define symbols
•• Assign addresses for instructionsAssign addresses for instructions
•• Assign addresses for symbolsAssign addresses for symbols
•• Perform assembler directives such as START, RESB, RESW, Perform assembler directives such as START, RESB, RESW, WORD, BYTE…WORD, BYTE…
–– Pass 2: assemble instructions and write object Pass 2: assemble instructions and write object programprogram
•• Encode instructions in terms of opcodes, target addresses, Encode instructions in terms of opcodes, target addresses, addressing modes as different formatsaddressing modes as different formats
•• Generate data for constantsGenerate data for constants
•• Perform some directives not handled in pass 1Perform some directives not handled in pass 1
•• Write object programs and listing filesWrite object programs and listing files
Assembler Algorithms and Data StructuresAssembler Algorithms and Data Structures
•• OPTABOPTAB
–– Translate mnemonics into binary opcodesTranslate mnemonics into binary opcodes
•• ADDADD��0x180x18
–– Usually readUsually read--onlyonly
–– Should be efficient on retrieval Should be efficient on retrieval –– Should be efficient on retrieval Should be efficient on retrieval
–– Also remembers formats and lengths of Also remembers formats and lengths of
corresponding instructions corresponding instructions
•• E.g., ADDRE.g., ADDR��0x90,format 2, 2 bytes0x90,format 2, 2 bytes
–– Usually implemented as a static hash tableUsually implemented as a static hash table
Assembler Algorithms and Data StructuresAssembler Algorithms and Data Structures
•• SYMTABSYMTAB–– Translate symbol names into target addressesTranslate symbol names into target addresses
•• E.g., in Fig 2.6, CLOOPE.g., in Fig 2.6, CLOOP��0x0006 and BUFFER0x0006 and BUFFER��0x00360x0036
–– Both lookups and insertions are neededBoth lookups and insertions are needed
–– Remembers types (byte or word) and length of Remembers types (byte or word) and length of symbolssymbols
•• E.g., symbol EOF is of 3 bytes and RETADR is of 1 wordE.g., symbol EOF is of 3 bytes and RETADR is of 1 word
–– Usually implemented as a hash tableUsually implemented as a hash table•• Must be efficient to deal with common prefixMust be efficient to deal with common prefix
•• LOCCTRLOCCTR–– Help to assign addresses to symbolsHelp to assign addresses to symbols
–– Initialized as 0Initialized as 0
–– Incremented by the length in bytes of the instruction Incremented by the length in bytes of the instruction just processedjust processed
Assembler Algorithms and Data StructuresAssembler Algorithms and Data Structures
•• Intermediate filesIntermediate files
–– Files storing temporary information between Files storing temporary information between
pass 1 and pass 2pass 1 and pass 2
•• Listing filesListing files
–– Have detailed information about symbol Have detailed information about symbol –– Have detailed information about symbol Have detailed information about symbol
addresses, encoded object codeaddresses, encoded object code
–– Source code and object code are listed sideSource code and object code are listed side--
byby--side for referenceside for reference
begin....read first input line....if OPCODE = 'START' then........begin............save #[OPERAND] as starting address............initialize LOCCTR to starting address............write line to intermediate file............read next input line........end {if START}....else........initialize LOCCTR to 0....while OPCODE <> 'END' do........begin............if this is not a comment line then................begin....................if there is a symbol in the LABEL field then........................begin............................search SYMTAB for LABEL............................if found then................................set error flag (duplicate symbol)............................else................................insert (LABEL, LOCCTR) into SYMTAB........................end {if symbol}....................search OPTAB for OPCODE........................end {if symbol}....................search OPTAB for OPCODE....................if found then........................add 3 {instruction length} to LOCCTR....................else if OPCODE = 'WORD' then........................add 3 to LOCCTR....................else if OPCODE = 'RESW' then........................add 3 * #[OPERAND] to LOCCTR....................else if OPCODE = 'RESB' then........................add #[OPERAND] to LOCCTR....................else if OPCODE = 'BYTE' then........................begin............................find length of constant in bytes............................add length to LOCCTR........................end {if BYTE}....................else........................set error flag (invalid operation code)................end {if not a comment}............write line to intermediate file............read next input line........end {while not END}....write last line to intermediate file....save (LOCCTR - Starting address) as program lengthend {Pass 1}
begin...read first input line {from intermediate file}...if OPCODE = 'START' then.......begin...........write listing line...........read next input line.......end {if START}...write Header record to object program...initialize first Text record...while OPCODE <> 'END' do......begin........if this is not a comment line then..........begin............search OPTAB for OPCODE............ if found then................begin...................if there is a symbol in OPERAND field then......................begin..........................search SYMTAB for OPERAND..........................if found then..............................store symbol value as operand address..........................else..............................begin..................................store 0 as operand address..................................store 0 as operand address..................................set error flag (undefined symbol)..............................end......................end {if symbol}...................else......................store 0 as operand address...................assemble the object code instruction................end {if opcode found}.............else if OPCODE = 'BYTE' or 'WORD' then................convert constant to object code.............if object code will not fit into the current Text record then...............begin...................write Text record to object program...................initialize new Text record...............end............add object code to Text record..........end {if not comment}........write listing line........read next input line......end {while not END}...write last Text record to object program...write end record to object program....write last listing lineend {Pass 2}
How about RESB, RESW?
MachineMachine--Dependent Assembler FeaturesDependent Assembler Features
•• SIC/XE is used to explain machineSIC/XE is used to explain machine--dependent dependent issuesissues–– Indirect addressing is indicated by prefix @Indirect addressing is indicated by prefix @
J @RETADRJ @RETADR
–– Immediate values are with prefix #Immediate values are with prefix #LDA #3LDA #3
–– The extended instruction format is indicated by prefix +The extended instruction format is indicated by prefix +–– The extended instruction format is indicated by prefix +The extended instruction format is indicated by prefix ++LDT #4096+LDT #4096•• Format 3 has only 12 bits for operandsFormat 3 has only 12 bits for operands
–– By default, SIC/XE assembly uses formatBy default, SIC/XE assembly uses format--3 relative 3 relative addressingaddressing
•• Sometimes target addresses are not reachable. In this case, Sometimes target addresses are not reachable. In this case, programmers should consider to use formatprogrammers should consider to use format--4 instructions.4 instructions.
•• The assembler The assembler won’twon’t automatically convert instructions from automatically convert instructions from formatformat--3 to format3 to format--4!!!4!!!
Instruction Formats and Addressing ModesInstruction Formats and Addressing Modes
•• The START specifies the begin address at 0The START specifies the begin address at 0–– It is to generate reIt is to generate re--locatable programslocatable programs
–– The loader takes care of where the begin address of The loader takes care of where the begin address of programsprograms
•• RegisterRegister--toto--register instructions are heavily usedregister instructions are heavily used–– Register names are usually put in SYMTABRegister names are usually put in SYMTAB–– Register names are usually put in SYMTABRegister names are usually put in SYMTAB
•• RegisterRegister--toto--memory instructions use extended memory instructions use extended format or relative addressingformat or relative addressing�� If extended format is specified, use itIf extended format is specified, use it
��Or target addresses are calculated relatively to PC or Or target addresses are calculated relatively to PC or BB
�� If target addresses are too far from the reach of PC or If target addresses are too far from the reach of PC or B, errors are generatedB, errors are generated
Instruction Formats and Addressing ModesInstruction Formats and Addressing Modes
•• Extended instruction format could be used Extended instruction format could be used to address data residing far from (B) and to address data residing far from (B) and (PC)(PC)15 0006 CLOOP +JSUB RDREC 4B10103615 0006 CLOOP +JSUB RDREC 4B101036
–– The symbol “BUFFER” of 4096 bytes gets in The symbol “BUFFER” of 4096 bytes gets in the way, too far to reach by relative the way, too far to reach by relative the way, too far to reach by relative the way, too far to reach by relative addressingaddressing
–– Long instructions could also be used to load Long instructions could also be used to load large immediate valueslarge immediate values
113 103C +LDT #4096 75101000113 103C +LDT #4096 75101000
Instruction Formats and Addressing ModesInstruction Formats and Addressing Modes
•• If formatIf format--4 prefix is not specified, then4 prefix is not specified, then
–– PCPC--relativerelative addressing is consideredaddressing is considered
•• --2048~+20472048~+2047
–– If the target address is can’t be reached by If the target address is can’t be reached by
PCPC--relative addressing, relative addressing, basebase--relativerelativePCPC--relative addressing, relative addressing, basebase--relativerelative
addressing is then consideredaddressing is then considered
•• 0~+40950~+4095
–– If both fail, an If both fail, an errorerror is generatedis generated
•• It’s programmers’ responsibility to It’s programmers’ responsibility to
–– set the base register with proper target addressset the base register with proper target address
–– Put data near instructions which access the dataPut data near instructions which access the data
Instruction Formats and Addressing ModesInstruction Formats and Addressing Modes
•• PCPC--relative addressingrelative addressing10 0000 FIRST STL RETADR 17202D10 0000 FIRST STL RETADR 17202D
40 0017 J CLOOP 3F2FEC40 0017 J CLOOP 3F2FEC
•• Program counter (PC) is incremented Program counter (PC) is incremented afterafter an an instruction is fetched and instruction is fetched and beforebefore the instruction the instruction is executedis executedis executedis executed–– FetchFetch��PC+(n),DecodingPC+(n),Decoding��ExecuteExecute
•• FetchFetch��PC+(n),DecodingPC+(n),Decoding��[Fetch operands][Fetch operands]��ExecuteExecute��[Store result][Store result]
–– When fetching operands from target addresses, the When fetching operands from target addresses, the PC had already been incrementedPC had already been incremented
•• 2D = 302D = 30--3, 63, 6--1A = 1A = --1414
Instruction Formats and Addressing ModesInstruction Formats and Addressing Modes
•• BaseBase--relative addressingrelative addressing
–– Register B could be used as either a Register B could be used as either a
temporary storage or a reference for relative temporary storage or a reference for relative
addressingaddressing
•• PC always refers to the memory address of the PC always refers to the memory address of the •• PC always refers to the memory address of the PC always refers to the memory address of the
next instruction to executenext instruction to execute
•• (B) can be configured as a reference for relative (B) can be configured as a reference for relative
addressing by using assembler directive addressing by using assembler directive BASEBASE
–– The programmer has his freedom to refer B to any legal The programmer has his freedom to refer B to any legal memory address by explicitly loading content into Bmemory address by explicitly loading content into B
•• If B is used for storage, use If B is used for storage, use NOBASENOBASE to tell to tell
assembler not to encode instructions in terms of assembler not to encode instructions in terms of
basebase--relative addressingrelative addressing
Instruction Formats and Addressing ModesInstruction Formats and Addressing Modes
12 0003 LDB #LENGTH 69202D12 0003 LDB #LENGTH 69202D
13 BASE LENGTH13 BASE LENGTH
•• Line 12: read the address of LENGTH instead of its Line 12: read the address of LENGTH instead of its contentcontent–– Note that this instruction uses PCNote that this instruction uses PC--relative addressingrelative addressing
•• Line 13: tell the assembler the following instructions Line 13: tell the assembler the following instructions could be encoded in terms of basecould be encoded in terms of base--relative addressing, relative addressing, where (B) is the address of symbol LENGTHwhere (B) is the address of symbol LENGTHwhere (B) is the address of symbol LENGTHwhere (B) is the address of symbol LENGTH
160 104E STCH BUFFER,X 57C003160 104E STCH BUFFER,X 57C003
•• Line 160: symbol BUFFER is out of the reach of PC and Line 160: symbol BUFFER is out of the reach of PC and therefore basetherefore base--relative addressing is usedrelative addressing is used–– In conjunction with indexed addressingIn conjunction with indexed addressing
Instruction Formats and Addressing ModesInstruction Formats and Addressing Modes
175 1056 EXIT STX LENGTH 134000175 1056 EXIT STX LENGTH 134000
•• Line 175: displacement is 0Line 175: displacement is 0
20 000A LDA LENGTH 03202620 000A LDA LENGTH 032026
•• Line 20: symbol LENGTH is reachable Line 20: symbol LENGTH is reachable •• Line 20: symbol LENGTH is reachable Line 20: symbol LENGTH is reachable from PC (0Dfrom PC (0D��33) and therefore PC33) and therefore PC--relative addressing is usedrelative addressing is used
Program RelocationProgram Relocation
•• It is desirable to load multiple programs It is desirable to load multiple programs into memoryinto memory
–– multiprogrammingmultiprogramming
•• In most cases, the invocations of In most cases, the invocations of programs are not known until runprograms are not known until run--timetimeprograms are not known until runprograms are not known until run--timetime
–– Absolute addressing causes conflictions Absolute addressing causes conflictions
among storage spaces of programsamong storage spaces of programs
•• It is desirable to make programs reIt is desirable to make programs re--locatable anywhere in memorylocatable anywhere in memory
Program RelocationProgram Relocation
20 1006 LDA LENGTH 00103620 1006 LDA LENGTH 001036
•• Direct addressing (SIC), not reDirect addressing (SIC), not re--locatablelocatable
20 000A LDA LENGTH 03202620 000A LDA LENGTH 032026
•• PCPC--relative addressing (SIC/XE format 3), rerelative addressing (SIC/XE format 3), re--
locatablelocatable
Program RelocationProgram Relocation
•• Absolute addressing could be used byAbsolute addressing could be used by–– SIC instructionsSIC instructions
–– Format 4 SIC/XE instructionsFormat 4 SIC/XE instructions
•• Immediate addressing and registerImmediate addressing and register--toto--register register addressing are, surely, readdressing are, surely, re--locatablelocatable
•• If an instruction uses absolute addressing, its If an instruction uses absolute addressing, its •• If an instruction uses absolute addressing, its If an instruction uses absolute addressing, its target address for the operand must be target address for the operand must be “adjusted” for relocation“adjusted” for relocation–– It is impossible to tell reIt is impossible to tell re--locatable instructions from locatable instructions from
non renon re--locatable ones, unlesslocatable ones, unless•• the loader simulates program execution. It is impractical.the loader simulates program execution. It is impractical.
–– Assemblers generate some Assemblers generate some special recordsspecial records in object in object programs to direct loaders to perform such programs to direct loaders to perform such adjustments adjustments
1036
1036
1036
Program RelocationProgram Relocation
•• When an instruction using absolute When an instruction using absolute addressing is assembled, the assembler addressing is assembled, the assembler shouldshould
–– calculate the target address of the operand calculate the target address of the operand
relative to the beginning of the program (i.e., 0)relative to the beginning of the program (i.e., 0)relative to the beginning of the program (i.e., 0)relative to the beginning of the program (i.e., 0)
–– Generate a record so that the instruction can Generate a record so that the instruction can
be “modified” by the loader upon loadingbe “modified” by the loader upon loading
Modification RecordModification Record
Col. 1Col. 1 MM
Col. 2Col. 2--77 Starting location of the target address to be Starting location of the target address to be
modified, relative to the beginning of the programmodified, relative to the beginning of the program
(not relative to the first text record) (not relative to the first text record)
Col. 8Col. 8--99 Length of this record in halfLength of this record in half--bytebyte
Program RelocationProgram Relocation
•• The beginning of the address field could The beginning of the address field could be of odd halfbe of odd half--bytes, in the case bytes, in the case –– the beginning of this field is on halfthe beginning of this field is on half--byte byte
boundariesboundaries
–– the offset field of an M record (in terms of the offset field of an M record (in terms of bytes) is related to the first halfbytes) is related to the first half--byte rather byte rather bytes) is related to the first halfbytes) is related to the first half--byte rather byte rather than the zeroth halfthan the zeroth half--bytebyte
0 8Odd 1
0 8
Odd 1
incorrect
correct
Program RelocationProgram Relocation
•• A reA re--locatable object programlocatable object program
Program RelocationProgram Relocation
•• Not all formatNot all format--4 instructions need modification 4 instructions need modification
recordsrecords
–– +LDA #4096+LDA #4096
•• Of course, instructions using PCOf course, instructions using PC--relative, baserelative, base--
relative addressing, and registerrelative addressing, and register--toto--register register
addressing need no modification recordsaddressing need no modification records
•• For an SIC assembly program, almost all For an SIC assembly program, almost all
instructions need modification recordsinstructions need modification records
•• With modification records, the starting address With modification records, the starting address
of text records are interpreted as being relative of text records are interpreted as being relative
to the beginning of the programto the beginning of the program
Machine Independent Assembler FeaturesMachine Independent Assembler Features
•• Most of machine independent features are Most of machine independent features are for memory address calculations and for memory address calculations and symbol definingsymbol defining
–– LiteralsLiterals
–– SymbolSymbol--defining directivesdefining directives–– SymbolSymbol--defining directivesdefining directives
–– ExpressionsExpressions
–– Program blocksProgram blocks
–– Control section and linkingControl section and linking
LiteralsLiterals
•• Literals representing constants in instructions Literals representing constants in instructions
without explicitly defining a symbol and without explicitly defining a symbol and
reserving memory space for the constantreserving memory space for the constant
–– Must be careful with the differences among symbols, Must be careful with the differences among symbols,
immediate values, and literalsimmediate values, and literals
•• The value of a literal could be directly interpreted The value of a literal could be directly interpreted •• The value of a literal could be directly interpreted The value of a literal could be directly interpreted
from itselffrom itself45 001A ENDFIL LDA 45 001A ENDFIL LDA =C’EOF’ 032010=C’EOF’ 032010
215 1062 WLOOP TD215 1062 WLOOP TD =X’05’ E32011=X’05’ E32011
–– Prefix ‘=‘ is used literalsPrefix ‘=‘ is used literals
–– It could be noticed that literals are not encoded as It could be noticed that literals are not encoded as
immediate valuesimmediate values
LiteralsLiterals
•• Brain damageBrain damage–– Immediate operands: The target addresses of the Immediate operands: The target addresses of the
operands or the constant they specify are encoded as operands or the constant they specify are encoded as a part of instructionsa part of instructions
–– Literals: memory space is implicitly reserved and Literals: memory space is implicitly reserved and constants are accessed via memoryconstants are accessed via memory
•• Symbols with content defined by themselvesSymbols with content defined by themselves
•• Literals in fact provide only convenience to Literals in fact provide only convenience to programmersprogrammersENDFIL LDA ENDFIL LDA =C’EOF’=C’EOF’
ENDFIL LDA ENDFIL LDA EOFEOF
EOF RESB C’EOF’EOF RESB C’EOF’
ENDFIL LDA ENDFIL LDA #X’454F46’ #X’454F46’ if permitted?if permitted?
•• Immediate operands, on the other hand, decide Immediate operands, on the other hand, decide how instructions should be encodedhow instructions should be encoded
LiteralsLiterals
•• The assembler will gather all literals used The assembler will gather all literals used in a program and reserve memory space in a program and reserve memory space for them in a literal poolfor them in a literal pool–– Normally the literal pool is implicitly put at the Normally the literal pool is implicitly put at the
end of a programend of a program
–– A program could have one or more literal A program could have one or more literal –– A program could have one or more literal A program could have one or more literal poolspools
•• Literal pools could be explicitly introduced by Literal pools could be explicitly introduced by programmersprogrammers
•• The reason to do so is to prevent from literals The reason to do so is to prevent from literals being too far away from instructions using thembeing too far away from instructions using them
–– Literal pool could be generated by LTORG Literal pool could be generated by LTORG directivedirective
LiteralsLiterals
•• LTORGLTORG–– Reserve memory space for new literals since Reserve memory space for new literals since
the last LTORG or the beginning of programthe last LTORG or the beginning of program93 LTORG93 LTORG
002D * =C’EOF’ 454F46002D * =C’EOF’ 454F46
–– An implicit LTORG directive is handled at the An implicit LTORG directive is handled at the end of program to flush literals which are not end of program to flush literals which are not handled so farhandled so far
–– Missing LTORG in line 93 of Figure 2.10 will Missing LTORG in line 93 of Figure 2.10 will cause the literal =C’EOF’ being too far away cause the literal =C’EOF’ being too far away from line 45 for PCfrom line 45 for PC--relative addressingrelative addressing
•• Because 4096B BUFFER gets in the wayBecause 4096B BUFFER gets in the way
LiteralsLiterals
•• Literals with different strings sometimes refer to Literals with different strings sometimes refer to the same valuethe same value–– =C’EOF’ and =X’454F46’=C’EOF’ and =X’454F46’
–– To identify (logically) identical literals is a complicated To identify (logically) identical literals is a complicated task for assemblerstask for assemblers
•• Sometimes literals of the same name refer to Sometimes literals of the same name refer to •• Sometimes literals of the same name refer to Sometimes literals of the same name refer to different valuesdifferent values–– Symbol * refers to the current value of program Symbol * refers to the current value of program
countercounterBASE BASE **
LDBLDB =*=*
–– Symbol * refers to the current value of program Symbol * refers to the current value of program counter, on line 13 and line counter, on line 13 and line 5555 refer to 0003 and 0020, refer to 0003 and 0020, respectivelyrespectively
LiteralsLiterals
•• A new data structure LITTAB is adopted for the handling A new data structure LITTAB is adopted for the handling of literalsof literals–– Indexed by literal names (which actually refer to values of them)Indexed by literal names (which actually refer to values of them)
•• Pass 1Pass 1–– When a literal is recognized in operands, LITTAB is checkedWhen a literal is recognized in operands, LITTAB is checked
•• If found, nothing to doIf found, nothing to do
•• If not found, add a new entry for itIf not found, add a new entry for it•• If not found, add a new entry for itIf not found, add a new entry for it
–– A literal pool is created for literals which are not assigned to A literal pool is created for literals which are not assigned to memory addresses when memory addresses when
•• LTORG is handled orLTORG is handled or
•• the end of program is reachedthe end of program is reached
•• Pass 2Pass 2–– When a literal is recognized in operands, LITTAB is checkedWhen a literal is recognized in operands, LITTAB is checked
•• Get its target address assigned in Pass 1 from LITTABGet its target address assigned in Pass 1 from LITTAB
•• If literals are locationIf literals are location--dependent (such as *), modification records dependent (such as *), modification records are generated accordinglyare generated accordingly
–– Why?Why?
SymbolSymbol--Defining DirectivesDefining Directives
•• Why symbolWhy symbol--defining directives are useful?defining directives are useful?
–– We could assign meaningful names to constants to We could assign meaningful names to constants to
improve readabilityimprove readability+LDT+LDT #4096#4096
+LDT+LDT #MAXLEN#MAXLEN
MAXLENMAXLEN EQUEQU 40964096
–– Instead of stating the value of a constant everywhere Instead of stating the value of a constant everywhere
where it is referenced, a symbol could be created to where it is referenced, a symbol could be created to
unify themunify them
•• Avoid potential errors when the program is revised with a Avoid potential errors when the program is revised with a new value of the constantnew value of the constant
–– Since programmers might forget to modify some of them and, Since programmers might forget to modify some of them and,
consequently, runconsequently, run--time errors could be suffered fromtime errors could be suffered from
SymbolSymbol--Defining DirectivesDefining Directives
•• SYMTAB could also be used to deal with SYMTAB could also be used to deal with symbols defined by directive EQU symbols defined by directive EQU
•• A symbol must be defined as a previously A symbol must be defined as a previously defined symbol, a constant, an expression, defined symbol, a constant, an expression, or a literalor a literalor a literalor a literal–– MAXLENMAXLEN EQUEQU 40964096
–– BUFEND BUFEND EQUEQU **
–– MAXLENMAXLEN EQUEQU BUFENDBUFEND--BUFFERBUFFER
•• No memory space is reserved for themNo memory space is reserved for them
SymbolSymbol--Defining DirectivesDefining Directives
•• Forward reference is not permitted!Forward reference is not permitted!ALPHAALPHA EQUEQU BETABETA
BETA BETA EQUEQU DELTADELTA
DELTADELTA RESWRESW 11
–– The target address of a symbol might depend The target address of a symbol might depend on another symbolon another symbolon another symbolon another symbol
•• ALPHA depends on BETA, and BETA in turn ALPHA depends on BETA, and BETA in turn depends on DELTAdepends on DELTA
–– Not target addresses of all symbols could be Not target addresses of all symbols could be resolved in the first passresolved in the first pass
•• Multiple passes might be requiredMultiple passes might be required
•• Cyclic dependency is an errorCyclic dependency is an error
SymbolSymbol--Defining DirectivesDefining Directives
•• An example that exercises EQU:An example that exercises EQU:
–– Accessing a collection of 100 structures, in which one Accessing a collection of 100 structures, in which one
structure is of fields SYMBOL, VALUE, and FLAGSstructure is of fields SYMBOL, VALUE, and FLAGS
SYMBOLSYMBOL VALUEVALUE FLAGSFLAGS
STABSTAB RESBRESB 11001100
SYMBOLSYMBOL EQUEQU STABSTAB
VALUEVALUE EQUEQU STAB+6STAB+6
FLAGSFLAGS EQUEQU STAB+9STAB+9
……
LDALDA VALUE,XVALUE,X
…… …… ……
SymbolSymbol--Defining DirectivesDefining Directives
•• A collection of structures could better be A collection of structures could better be represented by using a new directive ORGrepresented by using a new directive ORG–– ORG is to reset LOCCTR to a desired value or the ORG is to reset LOCCTR to a desired value or the
SYMBOLSYMBOL VALUEVALUE FLAGSFLAGS
…… …… ……
–– ORG is to reset LOCCTR to a desired value or the ORG is to reset LOCCTR to a desired value or the target address of an operandtarget address of an operand
•• ORG valueORG value
–– Simply specifying “ORG” in assembly tells the Simply specifying “ORG” in assembly tells the assembler to restore the old value of LOCCTR before assembler to restore the old value of LOCCTR before “ORG something” is processed“ORG something” is processed
•• We could define symbols over previously We could define symbols over previously reserved memory space by using ORGreserved memory space by using ORG
SymbolSymbol--Defining DirectivesDefining DirectivesSTABSTAB RESBRESB 11001100
ORGORG STABSTAB
SYMBOLSYMBOL RESBRESB 66
VALUEVALUE RESWRESW 11
FLAGSFLAGS RESBRESB 22
ORGORG STAB+1100STAB+1100
…………
LDALDA VALUE,XVALUE,X
•• The type and size of a field could better be The type and size of a field could better be
stated in assembly in terms of readabilitystated in assembly in terms of readability
•• No forward reference should be applied to ORG, No forward reference should be applied to ORG,
because it causes that LOCCTR depends on the because it causes that LOCCTR depends on the
symbol specified in the operandsymbol specified in the operand
ExpressionsExpressions•• The value of a symbol could be an arithmetic expression The value of a symbol could be an arithmetic expression
over several termsover several terms–– +,+,--,*,/,*,/–– Constants, symbols, special terms (e.g., *)Constants, symbols, special terms (e.g., *)
**Be very careful with the terminology in the following contents****Be very careful with the terminology in the following contents**
•• A A relativerelative term refers one being relevant to the beginning term refers one being relevant to the beginning location of the programlocation of the programlocation of the programlocation of the program–– Relative terms are not allowed to be involved in * or / Relative terms are not allowed to be involved in * or / ��(a)(a)–– Any expression must have at most one relative term and it must Any expression must have at most one relative term and it must
be associated with + be associated with + ��(b)(b)
•• An An absoluteabsolute term is independent of where the program is term is independent of where the program is loadedloaded–– If relative terms are paired in opposite operations (+ and If relative terms are paired in opposite operations (+ and --) then ) then
the resulted expression is absolutethe resulted expression is absolute•• To cancel the dependency of the beginning location of the programTo cancel the dependency of the beginning location of the program
•• Violating either (a) or (b) is an errorViolating either (a) or (b) is an error
ExpressionsExpressions
107107 MAXLENMAXLEN EQUEQU BUFFENDBUFFEND--BUFFERBUFFER
•• MAXLEN is an absolute term, calculating MAXLEN is an absolute term, calculating the difference between BUFFEND and the difference between BUFFEND and BUFFERBUFFER
BUFFEND+BUFFERBUFFEND+BUFFER
100100--BUFFERBUFFER
3*BUFFER3*BUFFER
•• The above 3 expressions are considered The above 3 expressions are considered as errorsas errors
ExpressionsExpressions
•• The assembler must have types of The assembler must have types of symbols stored in SYMTABsymbols stored in SYMTAB
–– Either relative or absoluteEither relative or absolute
–– For the generation of modification recordsFor the generation of modification records
SYMBOLSYMBOL TYPETYPE ADDRESSADDRESS
RETADRRETADR RR 00300030
BUFFERBUFFER RR 00360036
BUFFENDBUFFEND RR 10361036
MAXLENMAXLEN AA 10001000
Program BlocksProgram Blocks
•• So far the generated instructions and data So far the generated instructions and data
appeared in the same order as they are in the appeared in the same order as they are in the
source assemblysource assembly
–– Programmers put data and codes that are logically Programmers put data and codes that are logically
close to one another together in source assemblyclose to one another together in source assembly
–– In hardware point of view, it usually is not a good In hardware point of view, it usually is not a good –– In hardware point of view, it usually is not a good In hardware point of view, it usually is not a good
placementplacement
•• Large data area usually gets in the way of relative Large data area usually gets in the way of relative addressingaddressing
•• Program blocks are segments of code that can Program blocks are segments of code that can
be rebe re--arranged when they are loadedarranged when they are loaded
–– Order of program blocks in memory could be different Order of program blocks in memory could be different
from that in source assemblyfrom that in source assembly
Program BlocksProgram Blocks
•• Directive “USE” indicates the program Directive “USE” indicates the program block which the following instructions and block which the following instructions and data are assigned todata are assigned to–– Source assembly is assumed to use the Source assembly is assumed to use the
unnamed (default) block unless explicitly unnamed (default) block unless explicitly specifiedspecifiedspecifiedspecified
–– When specifying a different block, the When specifying a different block, the following instructions and data are switched following instructions and data are switched from the old block to the specified block (Line from the old block to the specified block (Line 92)92)
–– The previously discontinued block could be The previously discontinued block could be resumed by specifying its name again (Line resumed by specifying its name again (Line 123)123)
Program BlocksProgram Blocks
•• The assembler should gather scattered The assembler should gather scattered instructions and data belonging to the instructions and data belonging to the same program blocksame program block
–– When the program is loaded, program blocks When the program is loaded, program blocks
are are rearrangedrearranged in the order in which they first in the order in which they first are are rearrangedrearranged in the order in which they first in the order in which they first
appearappear
–– Physically, the effect is the same as if the Physically, the effect is the same as if the
programmers did the arrangement in source programmers did the arrangement in source
assemblyassembly
Program BlocksProgram Blocks
•• Implementing the handling of program blocksImplementing the handling of program blocks
–– Pass 1Pass 1
•• Maintaining Maintaining differentdifferent LOCCTR’s for different program blocksLOCCTR’s for different program blocks
•• A new LOCCTR is created whenever a new program block A new LOCCTR is created whenever a new program block appearsappears
•• When switching to another program block, LOCCTR of the When switching to another program block, LOCCTR of the •• When switching to another program block, LOCCTR of the When switching to another program block, LOCCTR of the discontinued program block is saved and LOCCTR of the discontinued program block is saved and LOCCTR of the resumed program is restoredresumed program is restored
•• By the end of pass 1, By the end of pass 1, lengths of program blockslengths of program blocks could be could be determined and so are their starting addressesdetermined and so are their starting addresses
–– Pass 2Pass 2
•• Target addresses of operands are relative to the beginning of Target addresses of operands are relative to the beginning of the program, instead of the beginning of its corresponding the program, instead of the beginning of its corresponding program blockprogram block
Program BlocksProgram Blocks
•• At the end of Pass 1, information about the At the end of Pass 1, information about the program blocks are as followsprogram blocks are as follows
Block nameBlock name Block NumberBlock Number AddressAddress LengthLength
(default)(default) 00 00000000 00660066
CDATACDATA 11 00660066 000B000B
CBLKSCBLKS 22 00710071 10001000
2020 0006 0 LDA0006 0 LDA LENGTH 032060LENGTH 032060
CBLKSCBLKS 22 00710071 10001000
(default) CDATAL
EN
GT
H
3
LD
A
6 9
66
60
Program BlocksProgram Blocks
•• The benefits of using program blocks are twofoldThe benefits of using program blocks are twofold–– Physically, the need for complicated addressing Physically, the need for complicated addressing
modes such as formatmodes such as format--4 addressing and base4 addressing and base--relative relative addressing could be reducedaddressing could be reduced
•• Move the large memory space reserved for BUFFER to Move the large memory space reserved for BUFFER to somewhere elsesomewhere else
–– Logically, the source assembly could still maintain its Logically, the source assembly could still maintain its –– Logically, the source assembly could still maintain its Logically, the source assembly could still maintain its readabilityreadability
•• Because symbol BUFFER is still near instructions accessing Because symbol BUFFER is still near instructions accessing itit
•• The same effect could be achieved by inserting The same effect could be achieved by inserting data near instructions accessing themdata near instructions accessing them–– However, programmers must provide jump However, programmers must provide jump
instructions to get around datainstructions to get around data
Program BlocksProgram Blocks
•• Text records should reflect the starting Text records should reflect the starting address of program blocksaddress of program blocks
–– The first four are for (default)The first four are for (default)
–– The fifth is for CDATAThe fifth is for CDATA
–– The sixth is for (default)The sixth is for (default)–– The sixth is for (default)The sixth is for (default)
–– The last is for CDATAThe last is for CDATA
–– No text records are generated for CBLKSNo text records are generated for CBLKS
default
default
defaultdefault
default
CDATA
CDATA
2 text records�
2 text records�
Control SectionsControl Sections
•• Control sections vs. program blocksControl sections vs. program blocks
–– Control sections are very like separate assembly Control sections are very like separate assembly
source filessource files
•• Files (control sections) can be Files (control sections) can be assembled and loaded assembled and loaded independentlyindependently
•• Where a control section is loaded (and target addresses of Where a control section is loaded (and target addresses of •• Where a control section is loaded (and target addresses of Where a control section is loaded (and target addresses of symbols in it) is unknown until runtimesymbols in it) is unknown until runtime
–– A control section could be of a number of program A control section could be of a number of program
blocksblocks
•• Program blocks could be rearranged inside a control sectionProgram blocks could be rearranged inside a control section
•• Control sections could be loaded anywhere in memoryControl sections could be loaded anywhere in memory
–– A control section could be shared by a number of A control section could be shared by a number of
control sectionscontrol sections
Control SectionsControl Sections
•• Instructions in one control section could Instructions in one control section could refer to a symbol defined in another refer to a symbol defined in another control sectioncontrol section
–– Referred to as Referred to as external referenceexternal reference
–– The target address of the symbol could not be The target address of the symbol could not be –– The target address of the symbol could not be The target address of the symbol could not be
known until runtimeknown until runtime
–– The assembler must generate some auxiliary The assembler must generate some auxiliary
information to help the linking loader to information to help the linking loader to
resolve external referencesresolve external references
Control SectionsControl Sections
•• Three control sections are identified in Fig 2.16Three control sections are identified in Fig 2.16–– COPY, RDREC, WRRECCOPY, RDREC, WRREC
–– The assembler maintain The assembler maintain separate LOCCTRseparate LOCCTR beginning beginning at 0 for different control sectionsat 0 for different control sections
•• Much like that for program blocksMuch like that for program blocks
–– For example, instructions in COPY might reference For example, instructions in COPY might reference symbols in RDRECsymbols in RDRECsymbols in RDRECsymbols in RDREC
•• Two new directive EXTDEF and EXTREF are Two new directive EXTDEF and EXTREF are introducedintroduced–– EXTDEFEXTDEF names symbols which are defined in this names symbols which are defined in this
control section and might be referred to in another control section and might be referred to in another control sectioncontrol section
–– EXTREFEXTREF names symbols which are referred to in this names symbols which are referred to in this section and are defined in some other control section and are defined in some other control sectionssections
Control SectionsControl Sections55 COPYCOPY STARTSTART 00
66 EXTDEFEXTDEF BUFFER,BUFEND,LENGTHBUFFER,BUFEND,LENGTH
•• Exposes three symbols in COPY to other control Exposes three symbols in COPY to other control
sectionssections1515 00030003 CLOOPCLOOP +JSUB RDREC+JSUB RDREC 4B1000004B100000
•• Symbol RDREC is referenced in COPY but Symbol RDREC is referenced in COPY but •• Symbol RDREC is referenced in COPY but Symbol RDREC is referenced in COPY but
defined in RDRECdefined in RDREC
–– The assembler insert The assembler insert 00 to the target addressto the target address
–– Relative addressing is Relative addressing is notnot possible!!possible!!
•• External references should be used in conjunction with long External references should be used in conjunction with long instructionsinstructions
–– Necessary information must be provided to linking Necessary information must be provided to linking
loader so as to correct this during runtimeloader so as to correct this during runtime
Control SectionsControl Sections160 0017 +STCH BUFFER,X160 0017 +STCH BUFFER,X 5790000057900000
•• SimilarSimilar
190 0028 MAXLEN WORD BUFEND190 0028 MAXLEN WORD BUFEND--BUFFER 000000BUFFER 000000
•• Neither BUFEND nor BUFFER could be Neither BUFEND nor BUFFER could be resolvedresolved–– Linking loader needs to perform Linking loader needs to perform 22 modificationsmodifications–– Linking loader needs to perform Linking loader needs to perform 22 modificationsmodifications
•• +BUFEND and then +BUFEND and then ––BUFFERBUFFER
107 1000 MAXLEN EQU BUFEND107 1000 MAXLEN EQU BUFEND--BUFFERBUFFER
•• No object code is generated for this lineNo object code is generated for this line–– MAXLEN is defined in both COPY and RDREC but it MAXLEN is defined in both COPY and RDREC but it
arises arises nono errors of reerrors of re--defining symbolsdefining symbols
–– Brain damage: if operands of EQU involve external Brain damage: if operands of EQU involve external symbols?symbols?
Control SectionsControl Sections
•• Two new recordsTwo new records are introduced to direct are introduced to direct the linking loader about how external the linking loader about how external references could be resolvedreferences could be resolved
Define RecordDefine Record
Col. 1Col. 1 DDCol. 1Col. 1 DD
Col. 2Col. 2--77 Name of external symbol defined in this control sectionName of external symbol defined in this control section
Col. 8Col. 8--1313 Relative address of symbol within this control section (hex)Relative address of symbol within this control section (hex)
Col. 14Col. 14--7373 Repeat information in Col 2Repeat information in Col 2--13 for other external symbols13 for other external symbols
Refer recordRefer record
Col. 1Col. 1 RR
Col. 2Col. 2--77 Name of external symbol referred in this control sectionName of external symbol referred in this control section
Col. 8Col. 8--7373 Name of other external reference symbolsName of other external reference symbols
Control SectionsControl Sections
•• The format of modification record is The format of modification record is revisedrevised to to support the handling of external referencessupport the handling of external references
•• Relocation for formatRelocation for format--4 instructions without 4 instructions without external references could be handled by external references could be handled by referring to the referring to the control section’s namecontrol section’s name–– M00000705+COPYM00000705+COPY
Modification Record (revised)Modification Record (revised)
Col. 1Col. 1 MM
Col. 2Col. 2--77 Starting location of the target address to be Starting location of the target address to be
modified, relative to the beginning of the programmodified, relative to the beginning of the program
(not relative to the first text record) (not relative to the first text record)
Col. 8Col. 8--99 Length of this record in halfLength of this record in half--bytebyte
Col. 10Col. 10 Modification flag (+ or Modification flag (+ or --))
Col. 11Col. 11--1616 External symbol whose value is to be added to or External symbol whose value is to be added to or
subtracted from the indicated fieldsubtracted from the indicated field
Assembler Design OptionsAssembler Design Options
•• OneOne--pass assembler vs. multipass assembler vs. multi--pass pass assemblerassembler
–– The major difference between them is how The major difference between them is how
target addresses of symbols are resolvedtarget addresses of symbols are resolved
–– Multiple passes might be needed for forward Multiple passes might be needed for forward –– Multiple passes might be needed for forward Multiple passes might be needed for forward
referencesreferencesALPHAALPHA EQUEQU BETABETA
BETA BETA EQUEQU DELTADELTA
DELTADELTA RESWRESW 11
OneOne--Pass AssemblersPass Assemblers
•• Forward referencesForward references involve data items, involve data items, labels on instructions, or other symbolslabels on instructions, or other symbols
–– To prohibit forward references to data items is To prohibit forward references to data items is
not very restrictivenot very restrictive
•• E.g., defining all data before instructions (like CE.g., defining all data before instructions (like C--•• E.g., defining all data before instructions (like CE.g., defining all data before instructions (like C--
programs do)programs do)
–– To eliminate forward references to labels on To eliminate forward references to labels on
instructions could be too restrictiveinstructions could be too restrictive
•• E.g., some conditionE.g., some condition--guarded loops might skip guarded loops might skip
aheadahead
OneOne--Pass AssemblersPass Assemblers
•• OneOne--pass assemblers could produce pass assemblers could produce object codes either in object codes either in memorymemory or to or to external storageexternal storage–– OneOne--pass assemblers usually need to modify pass assemblers usually need to modify
object code already generated, so whether object code already generated, so whether object code is stored in memory or external object code is stored in memory or external object code is stored in memory or external object code is stored in memory or external storage imposes difference considerations on storage imposes difference considerations on assembler designassembler design
•• LoadLoad--andand--go assemblers produce object go assemblers produce object code in memory and then start executing code in memory and then start executing immediatelyimmediately–– Useful for intensive testingUseful for intensive testing
OneOne--Pass AssemblersPass Assemblers
•• The handling of forward reference for loadThe handling of forward reference for load--andand--
go assemblersgo assemblers
–– When an instruction referring to an undefined symbol:When an instruction referring to an undefined symbol:
•• The symbol is entered in SYMTAB and The symbol is entered in SYMTAB and flagged as a flagged as a undefinedundefined oneone
•• Its Its target address is omittedtarget address is omitted in the object codein the object code•• Its Its target address is omittedtarget address is omitted in the object codein the object code
–– With respect to a not yet defined symbol, all With respect to a not yet defined symbol, all
instructions refer to it are instructions refer to it are chainedchained after its entry in after its entry in
SYMTABSYMTAB
•• Once the symbol is defined, the correct target address will be Once the symbol is defined, the correct target address will be inserted as operands for those instructionsinserted as operands for those instructions
–– If there is any undefined symbol after the entire If there is any undefined symbol after the entire
source assembly is scanned, errors would be source assembly is scanned, errors would be
generatedgenerated
160 204F JEQ EXIT 30205B
OneOne--Pass AssemblersPass Assemblers
•• For loadFor load--andand--go assemblers,go assemblers,
–– The actual beginning address of a program is The actual beginning address of a program is
known before the source program is known before the source program is
assembledassembled
•• Initial value for LOCCTR will be accordingly Initial value for LOCCTR will be accordingly •• Initial value for LOCCTR will be accordingly Initial value for LOCCTR will be accordingly
adjustedadjusted
–– Issues of Issues of program relocation could be ignoredprogram relocation could be ignored
since no object programs are storedsince no object programs are stored
•• The generated object code is useful only at that The generated object code is useful only at that
timetime
OneOne--Pass AssemblersPass Assemblers
•• For oneFor one--pass assemblers which store object pass assemblers which store object programs in external storage (e.g., files on disks)programs in external storage (e.g., files on disks)–– We can not assume that the object program can be We can not assume that the object program can be
randomly accessedrandomly accessed•• Maybe object programs are stored on tapes or an archive file?Maybe object programs are stored on tapes or an archive file?
•• Therefore, random updates to operands’ target addresses Therefore, random updates to operands’ target addresses (as load(as load--andand--go assemblers do) are not permittedgo assemblers do) are not permitted(as load(as load--andand--go assemblers do) are not permittedgo assemblers do) are not permitted
–– For any symbol involved in forward referencesFor any symbol involved in forward references•• Once the target address of the symbol is identified, additional Once the target address of the symbol is identified, additional
text records must be generated to text records must be generated to overwriteoverwrite those previously those previously omitted target addressesomitted target addresses
•• Caution: records must be loaded Caution: records must be loaded in the same orderin the same order as they as they appear in the object programappear in the object program
–– Actually, the handling of forward references are jointly Actually, the handling of forward references are jointly done by the assembler and the linking loaderdone by the assembler and the linking loader
ENDFIL
RDREC
EXIT
WRREC
Fig 2.20: how text is patched by additional text records
WRREC
MultiMulti--Pass AssemblersPass Assemblers
•• A multiA multi--pass assembler is able to handle pass assembler is able to handle forward reference in directives like EQUforward reference in directives like EQU
–– Forward references in Forward references in symbol defining symbol defining
directivesdirectives introduces additional dependenciesintroduces additional dependencies
•• From symbols to symbolsFrom symbols to symbols•• From symbols to symbolsFrom symbols to symbols
•• Forward references in ordinary instructions Forward references in ordinary instructions
introduces dependencies from symbols to introduces dependencies from symbols to
operands in instructionsoperands in instructions
–– Dependencies from instructions to anything are obviously Dependencies from instructions to anything are obviously not possiblenot possible
•• Dependencies of symbolDependencies of symbol��symbol symbol must first be must first be
resolved and then dependencies of resolved and then dependencies of symbolsymbol��instructioninstruction
Dependencies, illustratedDependencies, illustrated
LEGEND
symbol
instruction
dependency•Cycles are not permitted
•No edges from rectangles to anything
•Without dependencies among circles, any path is no longer than 1
• Multiple passes are needed whenever path length > 1
MultiMulti--Pass AssemblersPass Assemblers
•• A multiA multi--pass assembler does not pass assembler does not necessarily scan the source assembly necessarily scan the source assembly multiple timesmultiple times
–– Similar techniques as adopted in oneSimilar techniques as adopted in one--pass pass
assemblers could be used to resolve assemblers could be used to resolve assemblers could be used to resolve assemblers could be used to resolve
dependencies among symbolsdependencies among symbols
–– Of course, the two kinds of dependencies Of course, the two kinds of dependencies
(symbols(symbols��instructions and instructions and
symbolssymbols��symbols) symbols) could be handled in the could be handled in the
first pass if proper design is adoptedfirst pass if proper design is adopted
MultiMulti--Pass AssemblersPass Assemblers
11 HALFSZHALFSZ EQUEQU MAXLEN/2MAXLEN/2
22 MAXLENMAXLEN EQUEQU BUFENDBUFEND--BUFFERBUFFER
33 PREVBTPREVBT EQUEQU BUFFERBUFFER--11
..
..
....
44 BUFFERBUFFER RESBRESB 40964096
55 BUFENDBUFEND EQUEQU **
•• SYMTAB could be revised to handle SYMTAB could be revised to handle forward references in symbol defining forward references in symbol defining instructionsinstructions
11 HALFSZHALFSZ EQUEQU MAXLEN/2MAXLEN/222 MAXLENMAXLEN EQUEQU BUFENDBUFEND--BUFFERBUFFER33 PREVBTPREVBT EQUEQU BUFFERBUFFER--11......44 BUFFERBUFFER RESBRESB 4096409655 BUFENDBUFEND EQUEQU **
11 HALFSZHALFSZ EQUEQU MAXLEN/2MAXLEN/222 MAXLENMAXLEN EQUEQU BUFENDBUFEND--BUFFERBUFFER33 PREVBTPREVBT EQUEQU BUFFERBUFFER--11......44 BUFFERBUFFER RESBRESB 4096409655 BUFENDBUFEND EQUEQU **
1034
11 HALFSZHALFSZ EQUEQU MAXLEN/2MAXLEN/222 ALPHAALPHA EQUEQU BUFENDBUFEND--BUFFERBUFFER33 PREVBTPREVBT EQUEQU BUFFERBUFFER--11......44 BUFFERBUFFER RESBRESB 4096409655 BUFENDBUFEND EQUEQU **
1034
2034