ss module 1

8/14/2019 ss module 1


MODULE I MCA-303 SYSTEM SOFTWARE ADMN 2011- 1 2

Dept. of Computer Science And Applications, SJCET, Palai

1.1 Review of assembly and machine language programming

1.1.1 Machine Language

Machine language is a sequence of instructions written as binary numbers, consisting of 1's and 0's, to which the computer responds directly. Machine language was initially referred to as code, although now the term code is used more broadly to refer to any program text.

An instruction prepared in any machine language has at least two parts. The first part is the command or operation, which tells the computer what function is to be performed. All computers have an operation code for each of their functions. The second part of the instruction is the operand, which tells the computer where to find or store the data that has to be manipulated.
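As a rough illustration of this two-part structure, the sketch below splits a 16-bit instruction word into an operation code and an operand address. The 4-bit/12-bit split and the operation names are assumptions made for this example only; they do not describe any particular machine.

```python
# Hypothetical 16-bit instruction format (an assumption for illustration):
# the top 4 bits hold the operation code, the low 12 bits hold the operand address.
ADDRESS_BITS = 12

# Made-up operation codes for this example only.
OPCODES = {0b0001: "LOAD", 0b0010: "ADD", 0b0011: "STORE"}


def decode(word: int) -> tuple[str, int]:
    """Split an instruction word into (operation name, operand address)."""
    opcode = word >> ADDRESS_BITS                # first part: the operation
    address = word & ((1 << ADDRESS_BITS) - 1)   # second part: the operand
    return OPCODES.get(opcode, "UNKNOWN"), address


instruction = 0b0010_000001100100   # opcode 0010, operand address 100
print(decode(instruction))          # prints ('ADD', 100)
```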

Just as hardware is classified into generations based on technology, computer languages also have a generation classification based on the level of interaction with the machine. Machine language is considered to be the first-generation language.

Advantage of Machine Language

It is faster in execution since the computer directly starts executing it.

Disadvantage of Machine Language

It is difficult to understand and develop a program using machine language. Anybody going through such a program to check it will have difficulty understanding what will be achieved when the program is executed. Nevertheless, the computer hardware recognizes only this type of instruction code.


1.1.2 Assembly Language

When we employ symbols (letters, digits or special characters) for the operation part, the address part and other parts of the instruction code, this representation is called an assembly language program. This is considered to be the second-generation language.

Machine and assembly languages are referred to as low-level languages since the coding for a problem is at the individual instruction level. Each machine has its own assembly language, which is dependent upon the internal architecture of the processor.

An assembler is a translator which takes its input in the form of an assembly language program and produces machine language code as its output.

The following program is an example of an assembly language program for adding two numbers X and Y and storing the result in some memory location.




From this program, it is clear that the usage of mnemonics (in our example LD, ADD and HALT are the mnemonics) has improved the readability of our program significantly. An assembly language program cannot be executed by a machine directly as it is not in binary form. An assembler is needed in order to translate an assembly language program into the object code executable by the machine. This is illustrated in the figure.

[Figure: the assembler translates an assembly language program into machine language code]




Advantage of Assembly Language


Writing a program in assembly language is more convenient than in machine language. Instead of binary sequences, as in machine language, the program is written in the form of symbolic instructions. Therefore, it is somewhat more readable.




Disadvantages of Assembly Language

An assembly language program is specific to a particular machine architecture. Assembly languages are designed for a specific make and model of microprocessor. This means that assembly language programs written for one processor will not work on a different processor if it is architecturally different. That is why an assembly language program is not portable. An assembly language program also cannot be run as directly as a machine language program: it first has to be translated into machine (binary) language code.

The time and cost of creating programs in machine and assembly languages was quite high.

1.2 System software and Application software

Software is mainly classified into two types: system software and application software.

1.2.1 System software

System software is any computer software which manages and controls computer hardware so that application software can perform a task. Operating systems, such as Microsoft Windows, Mac OS X or Linux, are prominent examples of system software.

System software performs tasks like transferring data from memory to disk, or rendering text onto a display device. Specific kinds of system software include loading programs, operating systems, device drivers, programming tools, compilers, assemblers, linkers, and utility software.

System software is responsible for managing a variety of independent hardware components, so that they can work together harmoniously. Its purpose is to unburden the application software programmer from the often complex details of the particular computer being used, including such accessories as communications devices, printers, device readers, displays and keyboards, and also to partition the computer's resources such as memory and processor time in a safe and stable manner.

1.2.2 Application software

Application software consists of programs designed to perform specific tasks for users. Application software can be used as a productivity/business tool; to assist with graphics and multimedia projects; to support home, personal, and educational activities; and to facilitate communications. Specific application software products, called software packages, are available from software vendors. Word processing software is one example.

There are two main categories of application programs: business programs and scientific application programs. Most programming languages are designed to be good for one category of applications but not necessarily for the other, although there are some general-purpose languages
that support both types. Business applications are characterized by processing of large inputs and large outputs and by high-volume data storage and retrieval, but call for simple calculations. Languages which are suitable for business program development must support high-volume input, output and storage but do not need to support complex calculations. On the other hand, programming languages that are designed for writing scientific programs contain very powerful instructions for calculations but rather poor instructions for input, output etc. Amongst traditionally used programming languages, COBOL (Common Business-Oriented Language) is more suitable for business applications whereas FORTRAN (Formula Translation) is more suitable for scientific applications.

Major differences between system software and application software

1) System software runs the system, whereas application software runs on top of the system software.
2) System software consists of programs that run and control the hardware units of the system; application software does not.
3) On Windows, system programs take the form of DLL and EXE files, and on Linux they are distributed as packages such as RPM (Red Hat Package Manager) files; application software is developed on the basis of these files or by using files written in different languages.
4) System software is not written to serve end users directly, whereas application software is made specifically to meet users' needs.

1.3 Language Processors

1.3.1 Introduction

Language processing activities arise due to the differences between the manner in which a software designer describes the ideas concerning the behaviour of a software and the manner in which these ideas are implemented in a computer system.

The interpreter is a language translator. This leads to many similarities between translators and interpreters. From a practical viewpoint many differences also exist between translators and interpreters.




The absence of a target program implies the absence of an output interface of the interpreter. Thus the language processing activities of an interpreter cannot be separated from its program execution activities. Hence we say that an interpreter 'executes' a program written in a PL.

1.3.2 Problem Oriented and Procedure Oriented Languages:

The three consequences of the semantic gap mentioned at the start of this section are in fact the consequences of a specification gap. Software systems are poor in quality and require large amounts of time and effort to develop due to difficulties in bridging the specification gap. A classical solution is to develop a PL such that the PL domain is very close or identical to the application domain.

Such PLs can only be used for specific applications; hence they are called problem-oriented languages. They have large execution gaps; however, this is acceptable because the gap is bridged by the translator or interpreter and does not concern the software designer.

A procedure-oriented language provides general purpose facilities required in most application domains. Such a language is independent of specific application domains. The fundamental language processing activities can be divided into those that bridge the specification gap and those that bridge the execution gap. We name these activities as

1. Program generation activities
2. Program execution activities.

A program generation activity aims at automatic generation of a program. The source language is a specification language of an application domain and the target language is typically a procedure-oriented PL. A program execution activity organizes the execution of a program written in a PL on a computer system. Its source language could be a procedure-oriented language or a problem-oriented language.

Program Generation

The program generator is a software system which accepts the specification of a program to be generated, and generates a program in the target PL. In effect, the program generator introduces a new domain between the application and PL domains; we call this the program generator domain. The specification gap is now the gap between the application domain and the program generator domain. This gap is smaller than the gap between the application domain and the target PL domain. Reduction in the specification gap increases the reliability of the generated program. Since the generator domain is close to the application domain, it is easy for the designer or programmer to write the specification of the program to be generated.

The harder task of bridging the gap to the PL domain is performed by the generator. This arrangement also reduces the testing effort. Proving the correctness of the program generator amounts to proving the correctness of the transformation. This would be performed while implementing the generator. To test an application generated by using the generator, it is necessary only to verify the correctness of the specification input to the program generator. This is a much simpler task than verifying the correctness of the generated program.




This task can be further simplified by providing a good diagnostic (i.e. error indication) capability in the program generator, which would detect inconsistencies in the specification.

It is more economical to develop a program generator than to develop a problem-oriented language. This is because a problem-oriented language suffers a very large execution gap between the PL domain and the execution domain, whereas the program generator has a smaller semantic gap to the target PL domain, which is the domain of a standard procedure-oriented language. The execution gap between the target PL domain and the execution domain is bridged by the compiler or interpreter for the PL.

Program Execution

Two popular models for program execution are translation and interpretation.

Program translation

The program translation model bridges the execution gap by translating a program written in a PL, called the source program (SP), into an equivalent program in the machine or assembly language of the computer system, called the target program (TP).

Characteristics of the program translation model are:

1. A program must be translated before it can be executed.
2. The translated program may be saved in a file. The saved program may be executed repeatedly.
3. A program must be retranslated following modifications.

Program interpretation

The interpreter reads the source program and stores it in its memory. During interpretation it takes a source statement, determines its meaning and performs actions which implement it. This includes computational and input-output actions.

The CPU uses a program counter (PC) to note the address of the next instruction to be executed. This instruction is subjected to the instruction execution cycle consisting of the following steps:

1. Fetch the instruction.
2. Decode the instruction to determine the operation to be performed, and also its operands.
3. Execute the instruction.
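The three steps above can be sketched as a loop over a toy machine. The single accumulator, the four operations and the use of names as memory addresses are assumptions made for this illustration; real CPUs differ in many details.

```python
# Toy machine (hypothetical): each instruction is an (opcode, operand) pair
# and memory is a dictionary keyed by operand name.
def run(program, memory):
    pc = 0     # program counter: address of the next instruction
    acc = 0    # a single accumulator register
    while True:
        opcode, operand = program[pc]    # 1. fetch the instruction
        if opcode == "LOAD":             # 2. decode, then 3. execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        pc += 1                          # update PC for the next cycle


program = [("LOAD", "X"), ("ADD", "Y"), ("STORE", "SUM"), ("HALT", None)]
print(run(program, {"X": 2, "Y": 3, "SUM": 0})["SUM"])   # prints 5
```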

At the end of the cycle, the instruction address in PC is updated and the cycle is repeated for the next instruction. Program interpretation can proceed in an analogous manner. Thus, the PC can indicate which statement of the source program is to be interpreted next. This statement would be subjected to the interpretation cycle, which could consist of the following steps:

1. Fetch the statement




2. Analyze the statement and determine its meaning, viz. the computation to be performed and its operands.
3. Execute the meaning of the statement.

From this analogy, we can identify the following characteristics of interpretation:

1. The source program is retained in the source form itself, i.e. no target program form exists.
2. A statement is analyzed during its interpretation.
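A minimal sketch of the interpretation cycle is given below. It assumes a hypothetical source language in which every statement has the form `name = term + term + ...`; the language and the function name are invented for this example. Note that the source lines are never translated into a target program: each statement is analyzed again every time it is interpreted.

```python
def interpret(source_lines, variables):
    """Interpret statements of the assumed form 'name = term + term ...'."""
    pc = 0                                       # next statement to interpret
    while pc < len(source_lines):
        statement = source_lines[pc]             # 1. fetch the statement
        target, expression = statement.split("=", 1)          # 2. analyze it
        terms = [term.strip() for term in expression.split("+")]
        value = sum(                             # 3. execute its meaning
            variables[term] if term in variables else int(term)
            for term in terms
        )
        variables[target.strip()] = value
        pc += 1
    return variables


result = interpret(["c = a + b", "d = c + 10"], {"a": 1, "b": 2})
print(result["c"], result["d"])   # prints 3 13
```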

Comparison

A fixed cost (the translation overhead) is incurred in the use of the program translation model. If the source program is modified, the translation cost must be incurred again irrespective of the size of the modification. However, execution of the target program is efficient since the target program is in the machine language. Use of the interpretation model does not incur the translation overheads. This is advantageous if a program is modified between executions, as in program testing and debugging.

1.3.3 Language Processing Activities

Language Processing = Analysis of SP + Synthesis of TP.

This definition motivates a generic model of language processing activities. We refer to the collection of language processor components engaged in analyzing a source program as the analysis phase of the language processor. Components engaged in synthesizing a target program constitute the synthesis phase.

A specification of the source language forms the basis of source program analysis. The specification consists of three components:

1. Lexical rules, which govern the formation of valid lexical units in the source language.
2. Syntax rules, which govern the formation of valid statements in the source language.
3. Semantic rules, which associate meaning with valid statements of the language.

The analysis phase uses each component of the source language specification to determine relevant information concerning a statement in the source program. Thus, analysis of a source statement consists of lexical, syntax and semantic analysis.
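To make the first of these steps concrete, the following sketch performs lexical analysis only: it breaks a statement into lexical units. The lexical rules used here (integers, identifiers and single-character operators) are assumptions for this example and are not the rules of any particular PL; syntax and semantic analysis are not shown.

```python
import re

# Assumed lexical rules: an integer, an identifier, or any other single
# non-blank character (treated as an operator).
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(\S))")


def scan(statement):
    """Break a source statement into (kind, text) lexical units."""
    tokens = []
    for number, name, operator in TOKEN_RE.findall(statement):
        if number:
            tokens.append(("int", number))
        elif name:
            tokens.append(("id", name))
        else:
            tokens.append(("op", operator))
    return tokens


print(scan("sum = a + 25"))
```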

The synthesis phase is concerned with the construction of target language statements which have the same meaning as a source statement. Typically, this consists of two main activities:

1. Creation of data structures in the target program
2. Generation of target code.

We refer to these activities as memory allocation and code generation, respectively.

Lexical Analysis (Scanning)




Statement format

An assembly language statement has the following format:

[Label] <opcode> <operand spec>[, <operand spec> ..]

where the notation [..] indicates that the enclosed specification is optional. If a label is specified in a statement, it is associated as a symbolic name with the memory word(s) generated for the statement. <operand spec> has the following syntax:

<symbolic name>[+<displacement>][(<index register>)]

Thus, some possible operand forms are: AREA, AREA+5, AREA(4), and AREA+5(4). The first specification refers to the memory word with which the name AREA is associated. The second specification refers to the memory word 5 words away from the word with the name AREA. Here '5' is the displacement or offset from AREA. The third specification implies indexing with index register 4, that is, the operand address is obtained by adding the contents of index register 4 to the address of AREA. The last specification is a combination of the previous two specifications.
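The four operand forms can be checked with a small sketch that computes the operand address for each. The address 200 for AREA and the value 10 held in index register 4 are made-up values for this illustration, and the function name is hypothetical.

```python
import re

# Matches <symbolic name>[+<displacement>][(<index register>)].
OPERAND_RE = re.compile(r"^([A-Za-z]\w*)(?:\+(\d+))?(?:\((\d+)\))?$")


def operand_address(spec, symtab, index_registers):
    """Return the memory address denoted by an operand specification."""
    match = OPERAND_RE.match(spec)
    if match is None:
        raise ValueError(f"invalid operand specification: {spec!r}")
    name, displacement, index = match.groups()
    address = symtab[name]                            # address of the name
    if displacement is not None:
        address += int(displacement)                  # offset from the name
    if index is not None:
        address += index_registers[int(index)]        # indexing
    return address


symtab = {"AREA": 200}       # assumed address of AREA
index_registers = {4: 10}    # assumed contents of index register 4
for spec in ("AREA", "AREA+5", "AREA(4)", "AREA+5(4)"):
    print(spec, operand_address(spec, symtab, index_registers))
```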

1.4.1.1 Assembly Language Statements

An assembly program contains three kinds of statements:

1. Imperative statements
2. Declaration statements
3. Assembler directives.

Imperative statements

An imperative statement indicates an action to be performed during the execution of the assembled program. Each imperative statement typically translates into one machine instruction.

Declaration statements

The syntax of declaration statements is as follows:

[Label] DS <constant>

[Label] DC '<value>'

The DS (short for declare storage) statement reserves areas of memory and associates names with them. Consider the following DS statements:

A DS 1

G DS 200

The first statement reserves a memory area of 1 word and associates the name A with it. The second statement reserves a block of 200 memory words. The name G is associated with the first word of the block. Other words in the block can be accessed through offsets from G, e.g. G+5 is the sixth word of the memory block, etc.

The DC (short for declare constant) statement constructs memory words containing constants. The statement

ONE DC '1'

associates the name ONE with a memory word containing the value '1'. The programmer can declare constants in different forms: decimal, binary, hexadecimal, etc. The assembler converts them to the appropriate internal form.

Use of constants

Contrary to the name 'declare constant', the DC statement does not really implement constants; it merely initializes memory words to given values. These values are not protected by the assembler; they may be changed by moving a new value into the memory word. For example, in Fig. 4.3 the value of ONE can be changed by executing an instruction MOVEM BREG, ONE.

An assembly program can use constants in the sense implemented in an HLL in two ways: as immediate operands, and as literals. Immediate operands can be used in an assembly statement only if the architecture of the target machine includes the necessary features. In such a machine, the assembly statement

ADD AREG, 5

is translated into an instruction with two operands: AREG and the value '5' as an immediate operand. Note that our simple assembly language does not support this feature, whereas the assembly language of Intel 8086 supports it (see Section 4.5).

ADD AREG, ='5'      =>      ADD AREG, FIVE
                            -------
                            FIVE DC '5'

      (a)                        (b)

Fig. 1 Use of literals in an assembly program

A literal is an operand with the syntax ='<value>'. It differs from a constant because its location cannot be specified in the assembly program. This helps to ensure that its value is not changed during execution of a program. It differs from an immediate operand because no architectural provision is needed to support its use. An assembler handles a literal by mapping its use into other features of the assembly language. Figure 1(a) shows use of a literal ='5'. Figure 1(b) shows an equivalent arrangement using a DC statement FIVE DC '5'. When the assembler encounters the use of a literal in the operand field of a statement, it handles the literal using an arrangement similar to that shown in Fig. 1(b): it allocates a memory word to contain the value of the literal, and replaces the use of the literal in a statement by an operand expression referring to this word. The value of the literal is protected by the fact that the name and address of this word are not known to the assembly language programmer.

Assembler directives

Assembler directives instruct the assembler to perform certain actions during the assembly of a program. Some assembler directives are described in the following.

START <constant>

This directive indicates that the first word of the target program generated by the assembler should be placed in the memory word with address <constant>.

END [<operand spec>]

This directive indicates the end of the source program. The optional <operand spec> indicates the address of the instruction where the execution of the program should begin. (By default, execution begins with the first instruction of the assembled program.)

1.4.1.2 Advantages of Assembly Language

The primary advantages of assembly language programming vis-a-vis machine language programming arise from the use of symbolic operand specifications. Figure 2 shows a changed program to compute N!/2, where rectangular boxes are used to highlight changes in the program.

One statement has been inserted before the PRINT statement to implement division by 2. In the machine language program, this leads to changes in addresses of constants and reserved memory areas. Because of this, addresses used in most instructions of the program had to change. Such changes are not needed in the assembly program since operand specifications are symbolic in nature.



        START   101
        READ    N              101)  + 09 0 114
        MOVER   BREG, ONE      102)  + 04 2 116
        MOVEM   BREG, TERM     103)  + 05 2 117
AGAIN   MULT    BREG, TERM     104)  + 03 2 117
        MOVER   CREG, TERM     105)  + 04 3 117
        ADD     CREG, ONE      106)  + 01 3 116
        MOVEM   CREG, TERM     107)  + 05 3 117
        COMP    CREG, N        108)  + 06 3 114
        BC      LE, AGAIN      109)  + 07 2 104
        DIV     BREG, TWO      110)  + 08 2 118
        MOVEM   BREG, RESULT   111)  + 05 2 115
        PRINT   RESULT         112)  + 10 0 115
        STOP                   113)  + 00 0 000
N       DS      1              114)
RESULT  DS      1              115)
ONE     DC      '1'            116)  + 00 0 001
TERM    DS      1              117)
TWO     DC      '2'            118)  + 00 0 002
        END

Fig. 2

Design specification of an assembler

We use a four step approach to develop a design specification for an assembler:

1. Identify the information necessary to perform a task.
2. Design a suitable data structure to record the information.
3. Determine the processing necessary to obtain and maintain the information.
4. Determine the processing necessary to perform the task.

The fundamental information requirements arise in the synthesis phase of an assembler. Hence it is best to begin by considering the information requirements of the synthesis tasks. We then consider how to make this information available, i.e. whether it should be collected during analysis or derived during synthesis.

Synthesis phase

Consider the assembly statement

MOVER BREG, ONE

in Fig. 4.3. We must have the following information to synthesize the machine instruction corresponding to this statement:

1. Address of the memory word with which name ONE is associated,
2. Machine operation code corresponding to the mnemonic MOVER.

The first item of information depends on the source program. Hence it must be made available by the analysis phase. The second item of information does not depend on the source program; it merely depends on the assembly language. Hence the synthesis phase can determine this information for itself.

Based on the above discussion, we consider the use of two data structures during the synthesis phase:

1. Symbol table
2. Mnemonics table.

Each entry of the symbol table has two primary fields: name and address. The table is built by the analysis phase. An entry in the mnemonics table has two primary fields: mnemonic and opcode. The synthesis phase uses these tables to obtain the machine address with which a name is associated, and the machine opcode corresponding to a mnemonic, respectively. Hence the tables have to be searched with the symbol name and the mnemonic as keys.
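The two look-ups can be sketched with dictionaries keyed by symbol name and by mnemonic. The addresses, opcodes and register codes below are read off the listing of Fig. 2; the output layout `+ opcode register address` imitates that listing, and the function name is hypothetical.

```python
# Symbol table: name -> address (values as they appear in the Fig. 2 listing).
SYMTAB = {"N": 114, "RESULT": 115, "ONE": 116, "TERM": 117, "TWO": 118}

# Mnemonics table: mnemonic -> machine opcode (values from the Fig. 2 listing).
MNEMONICS = {"ADD": 1, "MULT": 3, "MOVER": 4, "MOVEM": 5}

# Register codes as they appear in the Fig. 2 listing.
REGISTERS = {"BREG": 2, "CREG": 3}


def synthesize(mnemonic, register, symbol):
    """Build the machine form of '<mnemonic> <register>, <symbol>'."""
    opcode = MNEMONICS[mnemonic]    # searched with the mnemonic as key
    address = SYMTAB[symbol]        # searched with the symbol name as key
    return f"+ {opcode:02d} {REGISTERS[register]} {address:03d}"


print(synthesize("MOVER", "BREG", "ONE"))   # prints + 04 2 116
```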

Analysis phase

The primary function performed by the analysis phase is the building of the symbol table. For this purpose it must determine the addresses with which the symbolic names used in a program are associated. It is possible to determine some addresses directly, e.g. the address of the first instruction in the program; however, others must be inferred. Consider the assembly program of Fig. 4.3. To determine the address of N, we must fix the addresses of all program elements preceding it. This function is called memory allocation.

To implement memory allocation a data structure called location counter (LC) is introduced. The location counter is always made to contain the address of the next memory word in the target program. It is initialized to the constant specified in the START statement. Whenever the analysis phase sees a label in an assembly statement, it enters the label and the contents of LC in a new entry of the symbol table. It then finds the number of memory words required by the assembly statement and updates the LC contents.

(Hence the word 'counter' in 'location counter'.) This ensures that LC points to the next memory word in the target program even when machine instructions have different lengths and DS/DC statements reserve different amounts of memory. To update the contents of LC, the analysis phase needs to know the lengths of different instructions. This information simply depends on the assembly language, hence the mnemonics table can be extended to include this information in a new field called length. We refer to the processing involved in maintaining the location counter as LC processing.
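LC processing can be sketched on the program of Fig. 2. The sketch assumes that every imperative statement and every DC statement occupies one memory word and that DS reserves the number of words given by its operand, which is consistent with the addresses in the listing. The statement representation (label, mnemonic, operand) is chosen for this example.

```python
def build_symbol_table(statements):
    """Return {label: address} using a location counter (LC)."""
    symtab = {}
    lc = 0
    for label, mnemonic, operand in statements:
        if mnemonic == "START":
            lc = int(operand)        # LC is initialized from START
            continue
        if mnemonic == "END":
            break
        if label:
            symtab[label] = lc       # enter (label, LC contents)
        # Update LC by the number of words the statement needs.
        lc += int(operand) if mnemonic == "DS" else 1
    return symtab


PROGRAM = [
    ("", "START", "101"), ("", "READ", "N"), ("", "MOVER", "BREG, ONE"),
    ("", "MOVEM", "BREG, TERM"), ("AGAIN", "MULT", "BREG, TERM"),
    ("", "MOVER", "CREG, TERM"), ("", "ADD", "CREG, ONE"),
    ("", "MOVEM", "CREG, TERM"), ("", "COMP", "CREG, N"),
    ("", "BC", "LE, AGAIN"), ("", "DIV", "BREG, TWO"),
    ("", "MOVEM", "BREG, RESULT"), ("", "PRINT", "RESULT"),
    ("", "STOP", ""), ("N", "DS", "1"), ("RESULT", "DS", "1"),
    ("ONE", "DC", "'1'"), ("TERM", "DS", "1"), ("TWO", "DC", "'2'"),
    ("", "END", ""),
]
print(build_symbol_table(PROGRAM))
```

The addresses obtained for N, RESULT, ONE, TERM and TWO (114 to 118) and for AGAIN (104) agree with the listing.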




[Figure: mnemonics table with the fields mnemonic, opcode and length]

The tasks performed by the analysis and synthesis phase are as follows:

Analysis phase

1. Isolate the label, mnemonic opcode and operand fields of a statement.
2. If a label is present, enter the pair (symbol, <LC contents>) in a new entry of the symbol table.
3. Check validity of the mnemonic opcode through a look-up in the Mnemonics table.
4. Perform LC processing, i.e. update the value contained in LC by considering the opcode and operands of the statement.

Synthesis phase

1. Obtain the machine opcode corresponding to the mnemonic from the Mnemonics table.
2. Obtain the address of a memory operand from the Symbol table.
3. Synthesize a machine instruction or the machine form of a constant, as the case may be.




1.4.2 PASS STRUCTURE OF ASSEMBLERS

We have defined a pass of a language processor as one complete scan of the source program, or its equivalent representation. We discuss two pass and single pass assembly schemes in this section.

Two pass translation

Two pass translation of an assembly language program can handle forward references easily. LC processing is performed in the first pass and symbols defined in the program are entered into the symbol table. The second pass synthesizes the target form using the address information found in the symbol table. In effect, the first pass performs analysis of the source program while the second pass performs synthesis of the target program. The first pass constructs an intermediate representation (IR) of the source program for use by the second pass (see Fig. 4.7). This representation consists of two main components: data structures, e.g. the symbol table, and a processed form of the source program. The latter component is called intermediate code (IC).

1.4.2.1 Single pass translation

LC processing and construction of the symbol table proceed as in two pass translation. The problem of forward references is tackled using a process called backpatching. The operand field of an instruction containing a forward reference is left blank initially. The address of the forward referenced symbol is put into this field when its definition is encountered. The instruction corresponding to the statement

MOVER BREG, ONE

can be only partially synthesized since ONE is a forward reference. Hence the instruction opcode and address of BREG will be assembled to reside in location 101. The need for inserting the second operand's address at a later stage can be indicated by adding an entry to the Table of Incomplete Instructions (TII). This entry is a pair (<instruction address>, <symbol>), e.g. (101, ONE) in this case.




By the time the END statement is processed, the symbol table would contain the addresses of all symbols defined in the source program and TII would contain information describing all forward references. The assembler can now process each entry in TII to complete the concerned instruction. For example, the entry (101, ONE) would be processed by obtaining the address of ONE from the symbol table and inserting it in the operand address field of the instruction with assembled address 101. Alternatively, entries in TII can be processed in an incremental manner. Thus, when the definition of some symbol is encountered, all forward references to that symbol can be processed.
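Backpatching with a TII can be sketched as follows. The sketch processes TII at the end of the program (the first of the two alternatives described above). The statement representation, the one-word size of every statement and the small program are assumptions for this example; opcodes and register codes follow the Fig. 2 listing.

```python
OPCODES = {"STOP": 0, "MOVER": 4, "MOVEM": 5}
REGISTERS = {"BREG": 2, "CREG": 3}


def single_pass(statements, start):
    """Assemble in one pass; forward references are backpatched at the end."""
    symtab = {}   # symbol -> address
    tii = []      # Table of Incomplete Instructions: (address, symbol)
    code = {}     # address -> [opcode, register code, operand address]
    lc = start
    for label, mnemonic, register, symbol in statements:
        if label:
            symtab[label] = lc
        if mnemonic in OPCODES:
            address = 0
            if symbol is not None:
                if symbol in symtab:
                    address = symtab[symbol]     # backward reference
                else:
                    tii.append((lc, symbol))     # forward reference
                    address = None               # operand field left blank
            code[lc] = [OPCODES[mnemonic], REGISTERS.get(register, 0), address]
        lc += 1   # every statement here occupies one word (assumption)
    for address, symbol in tii:                  # processing at END
        code[address][2] = symtab[symbol]
    return code, tii


statements = [
    ("", "MOVER", "BREG", "ONE"),
    ("", "MOVEM", "BREG", "RESULT"),
    ("", "STOP", None, None),
    ("RESULT", "DS", None, None),
    ("ONE", "DC", None, None),
]
code, tii = single_pass(statements, 101)
print(tii)         # [(101, 'ONE'), (102, 'RESULT')]
print(code[101])   # [4, 2, 105]
```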

1.4.2.2 DESIGN OF A TWO PASS ASSEMBLER

Tasks performed by the passes of a two pass assembler are as follows:

Pass I

1. Separate the symbol, mnemonic opcode and operand fields.
2. Build the symbol table.
3. Perform LC processing.
4. Construct intermediate representation.

Pass II

1. Synthesize the target program.

Pass I performs analysis of the source program and synthesis of the intermediate representation while Pass II processes the intermediate representation to synthesize the target program.
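The division of work between the passes can be sketched on a short program. This is a simplified illustration, not the design developed in the text: the intermediate code is just a list of (address, mnemonic, operands) tuples, every instruction and DC occupies one word, and the small source program is invented for the example. Opcodes and register codes follow the Fig. 2 listing; the code for AREG is an assumption.

```python
OPTAB = {"STOP": 0, "ADD": 1, "MOVER": 4, "MOVEM": 5}
REGS = {"AREG": 1, "BREG": 2, "CREG": 3}   # AREG's code is assumed

SOURCE = """\
        START 101
        MOVER BREG, ONE
        ADD   BREG, ONE
        MOVEM BREG, RESULT
        STOP
RESULT  DS    1
ONE     DC    '1'
        END
"""


def split_fields(line):
    """Separate the label, mnemonic and operand fields of a statement."""
    words = line.split()
    label = "" if line[0].isspace() else words.pop(0)
    operands = "".join(words[1:]).split(",") if len(words) > 1 else []
    return label, words[0], operands


def pass_one(source):
    """Build the symbol table and the intermediate code."""
    symtab, ic, lc = {}, [], 0
    for line in source.splitlines():
        label, mnemonic, operands = split_fields(line)
        if mnemonic == "START":
            lc = int(operands[0])
            continue
        if mnemonic == "END":
            break
        if label:
            symtab[label] = lc
        ic.append((lc, mnemonic, operands))
        lc += int(operands[0]) if mnemonic == "DS" else 1   # LC processing
    return symtab, ic


def pass_two(symtab, ic):
    """Synthesize the target program from the intermediate code."""
    target = {}
    for lc, mnemonic, operands in ic:
        if mnemonic == "DS":
            continue                            # storage only, no code
        if mnemonic == "DC":
            value = int(operands[0].strip("'"))
            target[lc] = f"+ 00 0 {value:03d}"
        else:
            reg = REGS[operands[0]] if operands else 0
            addr = symtab[operands[1]] if len(operands) > 1 else 0
            target[lc] = f"+ {OPTAB[mnemonic]:02d} {reg} {addr:03d}"
    return target


symtab, ic = pass_one(SOURCE)
for address, word in pass_two(symtab, ic).items():
    print(f"{address}) {word}")
```

Because ONE and RESULT are entered in the symbol table during the first pass, the second pass can resolve the forward references in the first three instructions without any backpatching.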

Pass I of the Assembler

Pass I uses the following data structures:

OPTAB    A table of mnemonic opcodes and related information
SYMTAB   Symbol table
LITTAB   A table of literals used in the program

Figure 4.9 illustrates sample contents of these tables while processing the program of Fig. 4.8. OPTAB contains the fields mnemonic opcode, class and mnemonic info. The class field indicates whether the opcode corresponds to an imperative statement (IS), a declaration statement (DL) or an assembler directive (AD). If an imperative, the mnemonic info field contains the pair (machine opcode, instruction length), else it contains the id of a routine to handle the declaration or directive statement. A SYMTAB entry contains the fields address and length. A LITTAB entry contains the fields literal and address.




Processing of an assembly statement begins with the processing of its label field. If it contains a symbol, the symbol and the value in LC are copied into a new entry of SYMTAB. Thereafter, the functioning of Pass I centers around the interpretation of the OPTAB entry for the mnemonic. The class field of the entry is examined to determine whether the mnemonic belongs to the class of imperative, declaration or assembler directive statements. In the case of an imperative statement, the length of the machine instruction is simply added to the LC. The length is also entered in the SYMTAB entry of the symbol (if any) defined in the statement. This completes the processing of the statement.

The use of LITTAB needs some explanation. The first pass uses LITTAB to collect all literals used in a program. Awareness of different literal pools is maintained using the auxiliary table POOLTAB. This table contains the literal number of the starting literal of each literal pool. At any stage, the current literal pool is the last pool in LITTAB. On encountering an LTORG statement (or the END statement), literals in the current pool are allocated addresses starting with the current value in LC and LC is appropriately incremented.
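The handling of literal pools can be sketched as follows. The statement representation (mnemonic, memory operand), the one-word size of every statement and literal, and the sample program are assumptions for this example. POOLTAB here stores 0-based positions in LITTAB rather than literal numbers.

```python
def collect_literals(statements, start):
    """Collect literals into LITTAB and allocate each pool at LTORG/END."""
    littab = []      # entries: [literal, address]
    pooltab = [0]    # position in littab of the first literal of each pool
    lc = start
    for mnemonic, operand in statements:
        if mnemonic in ("LTORG", "END"):
            # Allocate addresses to the literals of the current pool.
            for entry in littab[pooltab[-1]:]:
                entry[1] = lc
                lc += 1
            if mnemonic == "LTORG":
                pooltab.append(len(littab))   # a new pool starts here
            continue
        if operand.startswith("="):
            current_pool = [literal for literal, _ in littab[pooltab[-1]:]]
            if operand not in current_pool:
                littab.append([operand, None])
        lc += 1      # each statement occupies one word (assumption)
    return littab, pooltab


statements = [
    ("MOVER", "='5'"),
    ("ADD", "='1'"),
    ("LTORG", ""),
    ("ADD", "='1'"),
    ("STOP", ""),
    ("END", ""),
]
littab, pooltab = collect_literals(statements, 200)
print(littab)    # [["='5'", 202], ["='1'", 203], ["='1'", 206]]
print(pooltab)   # [0, 2]
```

The literal ='1' appears twice in LITTAB because its second use falls in a different pool, which is allocated only when END is reached.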
