
CH/S7CS/Nov., 2002

PROGRAMMING LANGUAGE

Language Evaluation Criteria (For Reference Only)

Readability

The ease with which programs can be read and understood.

A number of characteristics of programming languages contribute to their readability:

i. Overall simplicity

a) A language that has a large number of elementary components is usually more difficult to learn than one with a small number of elementary components.

b) Another problem is feature multiplicity, i.e. having more than one way to accomplish a particular operation.

E.g. In C, a user can increment a simple integer variable in four different ways:

count = count + 1
count++
++count
count += 1

c) A third problem is operator overloading, in which a single operator symbol has more than one meaning.

E.g. Overloading the operator ‘+’ to mean integer addition, floating-point addition, a unary operation, the sum of all elements of two single-dimensional arrays, and even vector addition.

Rem. : Language statements can also be simplified too much and reduce readability, e.g. assembly language.

ii. Orthogonality

a) In a programming language, it means that there is a relatively small set of primitive constructs that can be combined in a relatively small number of ways to build the control and data structures of the language.

b) Furthermore, every possible combination is legal and meaningful -- a symmetry of relationship among primitives.

c) Example 1: Addition in the assembly languages of the IBM mainframe computers and the VAX series of super-minicomputers

In IBM mainframe,

A Reg, memory_cell
AR Reg1, Reg2

Reg, Reg1 and Reg2 represent registers. The semantics of these are

Reg <--- contents(Reg) + contents(memory_cell)
Reg1 <--- contents(Reg1) + contents(Reg2)

In VAX machine,

ADDL operand_1, operand_2

whose semantics is

operand_2 <--- contents(operand_1) + contents(operand_2)

In this case, either operand can be a register or a memory cell.


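The VAX instruction design is orthogonal: there are two ways to specify an operand (register or memory cell), and they can be combined in any way. The IBM design is not orthogonal, because the register-memory and register-register combinations need different instructions, A and AR.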

d) Example 2 : Pascal

Procedures can have both variable and value parameters.

Functions can return only unstructured types.

Formal parameter types must be named; they cannot be complete type descriptions.

Files and structured data cannot be passed by value.

Thus the type rules of Pascal are not orthogonal.

** Rem : The extreme form of orthogonality leads to unnecessary complexity.

iii. Control Statements

a) The structured programming revolution of the 1970s was a reaction to the poor readability caused by the limited control structures of some of the languages of the 1950s and 1960s, e.g. BASIC and FORTRAN.

b) Early languages lacked control statements that allow strong restrictions on the use of GOTO, so writing highly readable programs in those languages was difficult.

iv. Data Structures

a) Boolean variable

Using an integer as a flag,

Error = 1

is ambiguous. Using a Boolean,

Error = true

is better.

b) A record data type provides a more readable way to represent employee records than a parallel array scheme. (data abstraction)
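A minimal Python sketch of the contrast in (b). The Employee record, its fields and the sample data are assumptions made for illustration; they are not taken from the notes.

```python
from dataclasses import dataclass

# Parallel-array scheme: one list per field, tied together only by index.
names = ["Chan", "Lee"]
salaries = [12000, 15000]


@dataclass
class Employee:
    """Record scheme: all fields of one employee live in one unit."""
    name: str
    salary: int


def to_records(names, salaries):
    """Pair up the parallel lists into Employee records."""
    return [Employee(n, s) for n, s in zip(names, salaries)]


employees = to_records(names, salaries)
# The intent is readable: pick the employee with the largest salary.
highest = max(employees, key=lambda e: e.salary)
```

With the record, each employee travels as one unit, so sorting or passing an employee cannot leave the name and salary lists out of step.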

v. Syntax Consideration

a) Identifier forms

Restricting identifiers to very short forms detracts from readability. E.g. BASIC and FORTRAN.

The availability of connector characters, such as the underscore, in identifiers is a great aid to readability.

b) Special words

Especially important is the method of forming compound statements, or statement groups, primarily in control constructs.

E.g. Pascal uses begin-end pairs and C uses braces ({}) for the same purpose. Both of these languages suffer because groups are always terminated in the same way, which makes it difficult to determine which group is being ended when an ‘end’ or ‘}’ is found.

FORTRAN 77 and Ada make this clearer by using distinct closing syntax for each type of statement group, e.g. end if and end loop in Ada.


If the special words of a language can be used as names for program variables, the resulting programs can be very confusing.

c) Form and meaning

Designing statements so that their appearance at least partially indicates their action is an obvious aid to readability.

E.g. In FORTRAN,

go to (10, 20, 30), I

means that the variable I stores a numeric value, while

go to I, (10, 20, 30)

means that it stores a label value.

Writability

The ease with which a language can be used to create programs for a chosen problem area.

Most of the language characteristics that affect readability also affect writability.

Writability must be considered in the context of the target problem domain of a language.

The most important factors influencing the writability of a language:

i. Simplicity and Orthogonality

a) A large number of different constructs may lead to a misuse of some features and a disuse of others that may be either more elegant or more efficient, or both, than those that are used.

b) A smaller number of primitive constructs and a consistent set of rules for combining them (orthogonality) is much better than simply having a large number of primitives.

ii. Support for Abstraction

a) Abstraction means that complicated structures or operations can be stated in simple ways by ignoring many of the details.

b) Example 1, the use of a subprogram to implement a sort algorithm that is required several times in a program.

c) Example 2, data abstraction, e.g. binary tree.

In FORTRAN, three parallel integer arrays are used.

In Pascal, a tree node is abstracted as a single record unit with two pointers and an integer.

Reliability

A desirable goal of programming language design is to allow and encourage reliable programs, i.e. programs that perform to their specifications under all conditions.

Several language features that affect the reliability:

i. Type checking

a) Type checking is the testing for type compatibility between two variables or a variable and a constant that are somehow involved with one another.

b) E.g. two sides of an operator, parameter correspondence.


ii. Exception Handling. The ability of a program to intercept run-time errors and other unusual conditions, to take corrective measures, and to continue is also a great aid to reliability, e.g. ON ERROR in BASIC.

iii. Aliasing

a) Aliasing is having two distinct referencing methods, or names, for the same memory cell.

b) It is now widely accepted that aliasing, without restriction, is too dangerous to justify its advantages.

iv. Readability and Writability. A program written in a language that does not support a natural way to express the required algorithms will necessarily use unnatural methods.

Cost

Types of costs

i. Cost of training programmers to use the language

ii. Cost of writing programs --> use high level language

iii. Cost of compiling programs

iv. Cost of executing programs. A language that requires many run-time type checks, such as PL/1, will prohibit fast code execution.

v. Cost of maintaining programs <-- readability

A simple trade-off can be made between compilation cost and the execution speed of the compiled code: extra compilation effort results in much faster code execution.

** A final note on evaluation criteria: Most criteria, particularly readability and writability, are neither measurable nor scientifically defined.

Factors Influencing the Language Design (Reference only)

Computer Architecture

The most popular languages have all been designed around the prevalent architecture, called the von Neumann architecture.

von Neumann architecture

Both data and program are stored in the same memory.

The processor is a unit separate from the memory.

i. Instructions and data must be piped, or transmitted, from memory to the processor.

ii. Results of operations in the processor must be moved back to memory.

The von Neumann architecture causes the central features of the imperative languages to be

i. variables, which model the memory cells.

ii. assignment statements, which are based on the piping operation (store and load).

iii. the iterative form of repetition, which suits this architecture because the instructions in a von Neumann computer are stored in adjacent cells of memory.


Programming Methodologies

Software engineering:

i. the analysis of both the programming process and programming language design.

ii. under intense study since the 1970s.

An important reason for the research in software engineering was the shift in the major cost of computing from hardware to software.

(From 80% hardware, 20% software; to 20% hardware, 80% software)

The primary programming language deficiencies that were discovered in the 1970s were incompleteness of type checking, inadequacy of control statements, and lack of facilities for exception handling.

E.g. Process-oriented design and the extensive efforts in the area of concurrency that are taking place in the 1980s are bringing with them the need for complete language facilities for creating and controlling concurrent program units.

Another example is object-oriented design (OOD).

Object-oriented (OO) approach

It emphasizes data design, concentrating on the use of logical, or abstract, data types to solve problems.

For data abstraction to be used effectively in software system design, it should be supported by the languages used to write the system.

A Database Application Example:

User Interface Objects: Button, Dialog Box, Scrollbar, ListBox, Form, Printer

Application Objects: Employee, People, Product, Directors, Liabilities

Database Objects: Databases, Tables, Fields, Records

OO Terminology

Class (noun)

i. A class is a type of entity having common attributes and behaviour.

ii. Examples: Employee, Printer, etc.

Object or Instance (noun)

i. An object is a particular entity which has attributes and behaviours as defined by a class.


ii. Examples: John, Peter, Printer-1, Printer-2, etc.

Attribute (noun or adj.)

i. An attribute is a property of an object or class.

ii. Example: Employee::Name, Employee::HKID, Printer::ModelNumber, Printer::Weight, etc.

iii. An attribute can itself be an object or class.

iv. Examples: Employee::Product, etc.

Message, Event

i. Objects communicate with each other through message or event passing.

ii. Examples: Print, Save, Load, etc.

Method

i. Methods are the functions defined by the class or object.

ii. Some methods are used to handle events.

iii. Examples: Print, Save, Load, etc.

iv. Some methods are used to process information.

v. Examples: CalculateAsset, FormatPage, etc.

[Diagram: Employee --FormatPage--> Form --PrintPage--> Printer]

Constructor vs Destructor

i. A constructor is a special method of a class which is executed once when the object is first instantiated (created). It is a good place to put initialisation and setup code for the object.

ii. A destructor is a special method of a class which is executed once when the object is destroyed. It is a good place to release resources held by the object.

OO methods

The THREE OO programming methods: Inheritance, Encapsulation and Polymorphism.

Encapsulation

What?

i. It separates the interface from the implementation.

ii. It provides a well-defined public interface to the user while hiding all internal complexities from them.

Why? A well-designed interface can save the user from having to know the complexities of the implementation in order to use your class/object.

How? In OO terms, an interface is composed of the methods and attributes visible to the user of your class.

An Employee class:

i. Identify the methods of an Employee from the context of the user, e.g. SignIn, SignOut, etc.


ii. Identify the attributes of an Employee from the context of the user, e.g. StaffID, Position, Department, Salary, etc.

Inheritance

What?

Inheritance specifies the relations between classes having similar properties.

Why?

i. It improves software reuse, eases software maintenance and eases software integration.

ii. OOD provides two class types, i.e. Base Class (Super-class) and Derived Class (Sub-class).

Class Inheritance.

i. A derived class inherits all methods and attributes of the base class.

ii. A derived class conceptually forms an “is-a-kind-of” relationship with the base class.

Polymorphism

What?

A particular function (e.g. Area()) behaves differently according to the class that the object belongs to. This is true even when the object is accessed indirectly through a reference of a base class type.

Why?

i. It simplifies programming by treating derived classes as base classes. (E.g. for pass-by-reference parameter passing in function calls.)

ii. It simplifies maintenance by not having to know the exact identity of the objects, making the code more general and extensible when new derived classes are created.
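A short Python sketch of the idea, assuming a hypothetical Shape base class with Circle and Rectangle as derived classes (only the function name Area comes from the notes; everything else is illustrative).

```python
import math


class Shape:
    """Base class: declares the interface shared by all shapes."""

    def area(self):
        raise NotImplementedError("derived classes must define area()")


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def total_area(shapes):
    """Works on any Shape; the call s.area() is resolved per object."""
    return sum(s.area() for s in shapes)
```

total_area never needs to know which derived class it was handed, so adding a new shape later does not require changing it.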

Abstraction. Allows a designer to ignore details and remain focused on the big picture. Start with a general system outline and progressively add more detail. (top-down approach)

Object Relationships

Is-a-kind-of relationship.

i. Represented in OOD as inheritance: a derived class is-a-kind-of base class.

ii. E.g. Employee is-a-kind-of People, having all the methods and attributes of People.


[Diagram: People is the base class for class Employee. Employee is-a-kind-of People (a derived class from class People). FullTimeEmployee and PartTimeEmployee are each is-a-kind-of Employee.]


Consists-of Relationship.

i. Occurs when an object is composed of other objects.

ii. Represented in OOD as attributes of a class.

iii. E.g. People consists-of a Name.

Uses Relationship.

i. An object (client) uses another object (server) to accomplish some task.

ii. An object can be both a client and a server.

iii. E.g. A Form object uses a Printer object to print the form.

Contains Relationship

i. Occurs when an object acts as a container for other objects.

ii. The containment is dynamic and usually transient: objects can be added to or removed from the container object.

iii. Represented in OOD as a container class.

iv. E.g. Lists, Queues, Forms, etc.

Program Translation

Implementation methods

The software that provides the high-level language interface to a computer can take several different forms: compilers, interpreters and impure interpreters.

[Figure 1: The layered interfaces, or virtual computers, provided by a typical computer system. Layers shown: high-level language program, high-level language implementation, operating system, bare machine (machine language interface).]

The software depends not only on the computer’s machine language, but also on a large collection of programs called the operating system that supplies higher-level primitives than those of the machine language.

Sample primitives: system resource management, input and output operations, a file management system, program editors, etc.


Compiler

It goes through all the stages of translation and converts the whole of the user's source program into machine code before the program is executed.

Linking may be necessary to connect the user code to the system programs.

The user and system code together is sometimes called a load module.

Pure Interpreter

It allows easy implementation of many source-level debugging operations, because all run-time error messages can refer to source-level units.

[Figure 2: The compilation process. Source program -> lexical analyzer -> lexical units -> syntax analyzer -> parse trees -> intermediate code generator -> intermediate code -> code generator -> machine code -> computer (with input data) -> results.]

[Figure 3: Impure interpretation. Source program -> lexical analyzer -> lexical units -> syntax analyzer -> parse trees -> intermediate code generator -> intermediate code -> interpreter (with input data) -> results.]

** von Neumann bottleneck

i. On a von Neumann architecture computer, programs reside in memory but are executed in the processor.

ii. Here’s the fetch-decode-execute cycle:

repeat forever
    fetch the next instruction
    decode the instruction
    execute the instruction


iii. The speed of the connection between a computer’s memory and its processor usually determines the speed of the computer, because instructions often can be executed faster than they can be moved to the processor for execution. This connection is called the von Neumann bottleneck.
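The cycle can be sketched in Python as a toy accumulator machine. The four opcodes (LOAD, ADD, STORE, HALT) and the memory layout are invented for illustration and do not model any real processor; the point is that program and data share one memory and every instruction is fetched from it before it runs.

```python
def run(memory):
    """Run a toy accumulator machine until HALT and return the memory."""
    pc = 0   # program counter: address of the next instruction
    acc = 0  # accumulator register inside the "processor"
    while True:
        instruction = memory[pc]       # fetch from memory
        pc += 1
        opcode, operand = instruction  # decode
        if opcode == "LOAD":           # execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


# Cells 0-3 hold instructions, cells 4-6 hold data (same memory).
program = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", None),
    2,
    3,
    0,
]
result = run(list(program))
```

Every pass through the loop touches memory at least once for the fetch, which is why the memory-processor connection limits overall speed.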

Impure interpretation

Impure interpreters translate high-level language programs to an intermediate language designed to allow easy interpretation. This is faster than pure interpretation because the source language statements are decoded only once.

From fig. 3, there are three stages of compilation: lexical analysis, syntax analysis and code generation.

Lexical analysis

It breaks up the source code that is input to the compiler into chunks that are in a form suitable to be analysed by the next stage of the compilation process.

The strings of characters representing the source program are broken up into small chunks, called tokens.

It is usual to remove all redundant parts of the source code (such as spaces and comments) during this tokenisation phase. It is also likely in many systems that keywords such as END or PROCEDURE will be replaced by a more efficient, shorter token.

It is the job of the lexical analyser to check that all the keywords used are valid and to group certain symbols with their neighbours so that they can form larger units to be presented in the next stage of the compilation process.

A symbol table for programmer-defined identifiers would be created during lexical analysis and would contain details of attributes such as data types. As part of this standardized format, the tokens may be replaced by pointers to symbol tables.

Typically entries in the symbol table will show

i. the identifier or keyword;

ii. the kind of item (variable, array, procedure, keyword, etc.);

iii. the type of item (integer, real, char, etc.);

iv. the run-time address of the item, or its value if it is a constant; and

v. a pointer to accessing information (e.g. for an array, the bounds of the array, or for a procedure, information about each of the parameters).


[Figure 4: Pure interpretation. Source program -> interpreter -> computer -> results.]


Since the lexical analyser spends a great proportion of its time looking up the symbol table, the symbol table must be organised in such a way that entries can be found as quickly as possible. Thus, a binary search tree may be used.

Sample symbol table:

item  name      kind of item  type of item  run-time address or value  pointer
1     read      keyword
2     pi        constant      real          3.14159
3     radius    variable      real          (?)
4     begin     keyword
5     writeln   keyword
6     no_sides  array         integer       (?)                        (?)
..
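The tokenisation step described above can be sketched in Python. The keyword set, the token categories and the shape of the symbol-table entries are assumptions chosen for this illustration; a real compiler would record far more, and here the kind and type of each identifier are simply left as "unknown" for later stages to fill in.

```python
import re

# Hypothetical keyword set for a small Pascal-like language.
KEYWORDS = {"begin", "end", "read", "writeln"}

TOKEN_PATTERN = re.compile(
    r"(?P<skip>\s+|\{[^}]*\})"      # spaces and { ... } comments are dropped
    r"|(?P<number>\d+(?:\.\d+)?)"   # integer or real constant
    r"|(?P<name>[A-Za-z_]\w*)"      # keyword or programmer-defined identifier
    r"|(?P<symbol>:=|[-+*/();,.])"  # operators and punctuation
)


def tokenize(source):
    """Return (tokens, symbol_table) for the given source text."""
    tokens = []
    symbol_table = {}
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "skip":
            continue  # redundant text never reaches the next stage
        if kind == "name":
            kind = "keyword" if text in KEYWORDS else "identifier"
            if kind == "identifier":
                # First sighting creates the entry; attributes come later.
                symbol_table.setdefault(
                    text, {"kind": "unknown", "type": "unknown"}
                )
        tokens.append((kind, text))
    return tokens, symbol_table


tokens, table = tokenize("begin radius := 2 * pi { area } end")
```

A dictionary stands in for the symbol table here; the notes suggest a binary search tree, which serves the same purpose of fast lookup by name.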

Syntax Analysis

It determines whether the string of input tokens forms valid sentences.

At this stage the structure of the source program is analysed to see if it conforms to the context-free grammar for the particular language being compiled.

This stage includes

i. finding out if the number of brackets is correct. (stack may be used, why?)

ii. determining the arithmetical operators used within an expression.

Complex forms may be broken down into simpler equivalents and more manageable form.
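One way to see why a stack suits the bracket check: the most recently opened bracket must be the first one closed, which is exactly last-in, first-out order. A minimal Python sketch follows; the function name and the choice of three bracket types are illustrative assumptions.

```python
# Maps each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}


def brackets_balanced(text):
    """Return True if every bracket in text is closed in the right order."""
    stack = []
    for ch in text:
        if ch in PAIRS.values():
            stack.append(ch)  # remember the opener until it is closed
        elif ch in PAIRS:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack  # leftover openers mean something was never closed
```

Counting brackets alone would accept ")(", whereas the stack rejects it because the closer arrives when nothing is open.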

The primary formal methods of describing the syntax of programming languages are context-free grammars, a formalism that is also known as Backus-Naur form, and syntax diagrams.

The syntax of a programming language is the form of its expressions, statements and program units.

The semantics of a programming language is the meaning of those expressions, statements and program units.

Backus-Naur Form

i. It was presented by Backus in 1959 and Naur in 1960.

ii. BNF is a metalanguage for programming languages. A metalanguage is a language that is used to describe another language.

iii. It uses abstractions for syntactic structures. A Pascal assignment statement, for example, might be represented by the abstraction <assign>. The actual definition of <assign> may be given by

<assign> ::= <var> := <expr>

iv. The text to the right of ‘::=’ is the definition of the symbol on the left side. The definition is called a rule, or production.

v. BNF is a generative tool for defining language. The sentences of the language are generated through repeated application of the rules, and such generation is called a derivation.

BNF example 1: A grammar for a small language

<program> ::= begin <stmt_list> end
<stmt_list> ::= <stmt>
              | <stmt> ; <stmt_list>
<stmt> ::= <var> := <expression>
<var> ::= A | B | C
<expression> ::= <var> + <var>
               | <var> - <var>
               | <var>

The above small language has only one statement form, assignment, of which the right hand side allows either a single variable, or two variables and either a + or - operator. The only allowable variable names are A, B and C. Here is a sample program:

begin
    A := B + C;
    B := C
end

A derivation of this program in this language follows:

<program> => begin <stmt_list> end
          => begin <stmt> ; <stmt_list> end
          => begin <var> := <var> + <var> ; <stmt_list> end
          => begin A := <var> + <var> ; <stmt_list> end
          => begin A := B + <var> ; <stmt_list> end
          => begin A := B + C ; <stmt_list> end
          => begin A := B + C ; <stmt> end
          => begin A := B + C ; <var> := <var> end
          => begin A := B + C ; B := <var> end
          => begin A := B + C ; B := C end

vi. Example 2. A grammar for simple assignment statements.

<assign> ::= <id> := <expr>
<id> ::= A | B | C
<expr> ::= <id> + <expr>
         | <id> * <expr>
         | ( <expr> )
         | <id>

One of the most attractive features of grammars is that they naturally describe the hierarchical syntactic structure of the sentences of the languages they define. Such hierarchical structures are called parse trees.

Thus the statement

A := B * (A + C)

can be generated by the following derivation, which forms the corresponding parse tree.

<assign> => <id> := <expr>
         => A := <expr>
         => A := <id> * <expr>
         => A := B * <expr>
         => A := B * ( <expr> )
         => A := B * ( <id> + <expr> )
         => A := B * ( A + <expr> )
         => A := B * ( A + <id> )
         => A := B * ( A + C )

(In the trees below, each child is indented under its parent.)

<assign>
    <id>
        A
    :=
    <expr>
        <id>
            B
        *
        <expr>
            (
            <expr>
                <id>
                    A
                +
                <expr>
                    <id>
                        C
            )

Figure 5. A parse tree for a simple assignment.

However, some grammars are ambiguous. E.g. under the grammar

<assign> ::= <id> := <expr>
<id> ::= A | B | C
<expr> ::= <expr> + <expr>
         | <expr> * <expr>
         | ( <expr> )
         | <id>

the sentence

A := B + C * A

has two distinct parse trees, as shown in fig. 6.

First tree (+ at the top of the expression):

<assign>
    <id>
        A
    :=
    <expr>
        <expr>
            <id>
                B
        +
        <expr>
            <expr>
                <id>
                    C
            *
            <expr>
                <id>
                    A

Second tree (* at the top of the expression):

<assign>
    <id>
        A
    :=
    <expr>
        <expr>
            <expr>
                <id>
                    B
            +
            <expr>
                <id>
                    C
        *
        <expr>
            <id>
                A

Figure 6. Two distinct parse trees for the same sentence.


viii. Example 3: An unambiguous grammar for expressions

<assign> ::= <id> := <expr>
<id> ::= A | B | C
<expr> ::= <expr> + <term>
         | <term>
<term> ::= <term> * <factor>
         | <factor>
<factor> ::= ( <expr> )
           | <id>

<assign>
    <id>
        A
    :=
    <expr>
        <expr>
            <term>
                <factor>
                    <id>
                        B
        +
        <term>
            <term>
                <factor>
                    <id>
                        C
            *
            <factor>
                <id>
                    A

Figure 7. The unique parse tree using an unambiguous grammar (each child is indented under its parent).

This grammar generates the same language as BNF example 2, but it indicates the proper precedence order of the multiply and add operators. A derivation of the sentence A := B + C * A forms a unique parse tree.
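A hedged Python sketch of a parser for the Example 3 grammar. Recursive descent cannot follow a left-recursive rule such as <expr> ::= <expr> + <term> directly, so this sketch swaps in the equivalent repetition form (<term> followed by zero or more "+ <term>"), which is an assumption of the illustration and not the method of the notes. The nested-tuple result and all function names are likewise invented here.

```python
import re


def parse_assign(text):
    """Parse '<id> := <expr>' and return the structure as nested tuples."""
    # Simplification: characters outside the grammar are silently skipped.
    tokens = re.findall(r":=|[ABC()+*]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        token = peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        pos += 1
        return token

    def identifier():                # <id> ::= A | B | C
        token = take()
        if token not in ("A", "B", "C"):
            raise SyntaxError(f"expected an identifier, found {token!r}")
        return token

    def factor():                    # <factor> ::= ( <expr> ) | <id>
        if peek() == "(":
            take("(")
            node = expr()
            take(")")
            return node
        return identifier()

    def term():                      # <term> ::= <factor> { * <factor> }
        node = factor()
        while peek() == "*":
            take("*")
            node = ("*", node, factor())
        return node

    def expr():                      # <expr> ::= <term> { + <term> }
        node = term()
        while peek() == "+":
            take("+")
            node = ("+", node, term())
        return node

    target = identifier()
    take(":=")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

Because term() is called from inside expr(), every multiplication is grouped before the addition that surrounds it, mirroring the way <term> sits below <expr> in the grammar. Repeated operators are grouped from left to right.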

i. Associativity of Operators

a) The assignment A := B + C + A should form a parse tree as follows (each child is indented under its parent):

<assign>
    <id>
        A
    :=
    <expr>
        <expr>
            <expr>
                <term>
                    <factor>
                        <id>
                            B
            +
            <term>
                <factor>
                    <id>
                        C
        +
        <term>
            <factor>
                <id>
                    A

b) Thus B + C is calculated first rather than C + A. This is called left associativity.

When a BNF rule has its LHS also appearing at the beginning of its RHS, the rule is said to be left recursive, which specifies left associativity.

c) When a BNF rule has its LHS also appearing at the end of its RHS, the rule is said to be right recursive, which specifies right associativity.

d) Rules such as

<factor> ::= <exp> ** <factor>
           | <exp>
<exp> ::= ( <expr> )
        | <id>

could be used to describe exponentiation as a right associative operator.

Extended Backus-Naur Form (EBNF)

i. Three extensions to BNF

a) [ ] Optional part of an RHS

e.g. if...then...else in Pascal:

<if_stmt> ::= if <logic_expr> then <stmt> [else <stmt>]

b) {} the part which can be repeated indefinitely


e.g. list of identifiers:

<ident_list> ::= <identifier> { , <identifier> }

c) ( ) A group from which a single element must be chosen.

e.g. for...do loop, with the alternatives stacked vertically inside the parentheses:

<for_stmt> ::= for <var> := <expr> ( to     ) <expr> do
                                   ( downto )

Alternately,

<for_stmt> ::= for <var> := <expr> ( to | downto ) <expr> do

Syntax Graph (Syntax Diagram)

i. e.g. The syntax diagram describing the Ada if statement is as follows:

[Syntax diagram. if-stmt: if, condition, then, stmts, with paths through else-if and through else, stmts, ending in end-if ;. else-if: elsif, condition, then, stmts.]

Figure 10: The syntax graph description of the Ada if statement.

ii. Two kinds of nodes:

a) Terminal symbols: circles and ellipses contain terminal symbols, which are lexemes in the language whose syntax is being described.

b) Non-terminal symbols: rectangles, each containing the name of a syntactic unit, or abstraction.

iii. Advantage: Easier to understand, by allowing us to visualise it.

Semantic Analysis (reference only)

There is no universal method of describing semantics.

Three methods: Operational, Axiomatic and Denotational.

Operational semantics

i. Using operational semantics to describe the semantics of a programming language requires the construction of two components:

a) a translator to convert statements to a chosen low-level language for a virtual machine.

b) the virtual machine itself.

ii. e.g. Describing Pascal for...do loop

Pascal statement             Operational semantics

for I := first to last do          I := first
begin                        loop: if I > last goto out
    .                              .
    .                              .
end                                I := I + 1
                                   goto loop
                             out:  ...


iii. Evaluation. It provides an effective means of describing semantics for language users and language implementers, as long as the descriptions are kept as simple and informal as possible.

Axiomatic Semantics

i. It is based on mathematical logic.

ii. Precondition: a predicate, or assertion, immediately before a statement that describes the constraints on the program variables at that point.

iii. Postcondition: an assertion immediately following a statement that describes the new constraints on those variables.

iv. e.g. if the postcondition { sum > 11 } follows the statement sum := 2 * x + 1, then one possible precondition is { x > 10 }, i.e. { x > 10 } sum := 2 * x + 1 { sum > 11 }

v. The weakest precondition is the least restrictive precondition that will guarantee the validity of the associated postcondition. For the above example, the preconditions {x > 10}, {x > 2000} and {x > 15.5} are all valid, but the weakest one is {x > 5}, since 2 * x + 1 > 11 exactly when x > 5.

vi. e.g.

a) The postcondition of the statement a := b/2 -1 is {a < 10}. The weakest precondition is {b<22}. Thus {b < 22} a := b/2 -1 {a < 10}.

b) In general, {P[x -> E]} x := E {P}, where P[x -> E] means substituting E for every occurrence of x in the postcondition P.

c) There is a wp transformer function used as follows

wp( x := E, P ) = P[x -> E]

vii. Sequence

If {P1} S1 {P2} and {P2} S2 {P3},

we get {P1} S1; S2 {P3}. If

S1 is x1 := E1 and S2 is x2 := E2,

then we get
{P3[x2→E2]} x2 := E2 {P3}
{(P3[x2→E2])[x1→E1]} x1 := E1 {P3[x2→E2]}

viii. For while loop

while y <> x do y := y + 1 { y = x }
For 0 iterations, the weakest precondition is { y = x }
For 1 iteration, wp(y := y + 1, {y = x}) = {y = x - 1}
For 2 iterations, wp(y := y + 1, {y = x - 1}) = {y = x - 2}
For 3 iterations, wp(y := y + 1, {y = x - 2}) = {y = x - 3}

If the postcondition of the loop is loop termination, the weakest precondition is {y <= x}: {y < x} covers one or more iterations and {y = x} covers zero iterations.

ix. Evaluation

a) A powerful tool for research into program correctness proofs.

b) There is no general method for creating the predicate transformer functions, so its usefulness is limited.
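The wp examples in this section can be checked mechanically. The sketch below is one possible encoding, not a standard one: predicates and expressions are modelled as Python functions over a state dictionary, so wp of an assignment is the postcondition evaluated in the updated state, which is the semantic counterpart of the substitution rule. The names `wp_assign`, `wp_seq` and `loop_terminates` are hypothetical, and the checks run over small sample ranges only, so they illustrate the rules rather than prove them.

```python
# Sketch: predicates and expressions are Python functions over a state dict.
# All names here (wp_assign, wp_seq, loop_terminates) are assumptions.

def wp_assign(var, expr, post):
    """wp(var := expr, post): evaluate `post` in the state after the assignment.

    This is the semantic counterpart of substituting expr for var in post.
    """
    return lambda s: post({**s, var: expr(s)})


def wp_seq(assignments, post):
    """wp of a sequence of assignments, computed from the last statement back."""
    for var, expr in reversed(assignments):
        post = wp_assign(var, expr, post)
    return post


# sum := 2 * x + 1 with postcondition {sum > 11}; expected wp is {x > 5}
pre = wp_assign("sum", lambda s: 2 * s["x"] + 1, lambda s: s["sum"] > 11)
agrees_with_x_gt_5 = all(pre({"x": x}) == (x > 5) for x in range(-20, 21))

# a := b / 2 - 1 with postcondition {a < 10}; expected wp is {b < 22}
pre_b = wp_assign("a", lambda s: s["b"] / 2 - 1, lambda s: s["a"] < 10)
agrees_with_b_lt_22 = all(pre_b({"b": b}) == (b < 22) for b in range(0, 45))

# Sequence (made-up example): x := x + 1; y := 2 * x with postcondition {y > 10}.
# Working backwards gives {2 * (x + 1) > 10}, that is {x > 4}.
pre_seq = wp_seq(
    [("x", lambda s: s["x"] + 1), ("y", lambda s: 2 * s["x"])],
    lambda s: s["y"] > 10,
)
agrees_with_x_gt_4 = all(pre_seq({"x": x}) == (x > 4) for x in range(-10, 11))


def loop_terminates(y, x, limit=1000):
    """Run `while y <> x do y := y + 1`, giving up after `limit` iterations."""
    steps = 0
    while y != x:
        if steps == limit:
            return False
        y += 1
        steps += 1
    return True


# On this sample, the loop terminates exactly when y <= x holds initially.
loop_matches = all(
    loop_terminates(y, 5, limit=100) == (y <= 5) for y in range(-10, 20)
)
```

Representing predicates as functions avoids symbolic substitution altogether; a real verification tool would manipulate formulas instead of testing sample states.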

Denotational Semantics

i. It defines both a mathematical object for each language entity and a function that maps instances of that entity onto instances of the mathematical object.


ii. e.g. BNF of a binary number

<bin_num> → 0 | 1 | <bin_num> 0 | <bin_num> 1

iii. The semantic function N, which maps the abstract syntax to non-negative integers, is as follows:

N[[ 0 ]] = 0
N[[ 1 ]] = 1
N[[ <bin_num> 0 ]] = 2 * N[[ <bin_num> ]]
N[[ <bin_num> 1 ]] = 2 * N[[ <bin_num> ]] + 1

iv. Evaluation.

a) In a similar but more complex way, objects and functions can be defined for the other syntactic entities of programming languages. This provides a framework for thinking about programming in a highly rigorous way, as well as a method of proving the correctness of programs.

b) It can be used as an aid to language design.
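The semantic function for binary numbers translates almost line for line into a recursive function. The following Python sketch assumes the numeral is given as a string of the characters 0 and 1; the error handling for other input is an addition for illustration and is not part of the definition in the notes.

```python
def N(bits: str) -> int:
    """Denotation of a binary numeral, following the four equations for N."""
    if bits == "0":
        return 0
    if bits == "1":
        return 1
    if not bits or bits[-1] not in "01":
        raise ValueError(f"not a binary numeral: {bits!r}")
    prefix, last = bits[:-1], bits[-1]
    # N[[ <bin_num> 0 ]] = 2 * N[[ <bin_num> ]]
    # N[[ <bin_num> 1 ]] = 2 * N[[ <bin_num> ]] + 1
    return 2 * N(prefix) + (1 if last == "1" else 0)


six = N("110")
eleven = N("1011")
```

Each syntactic form of `<bin_num>` has exactly one case in the function, which is the defining feature of a denotational definition.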

Attribute Grammars

An attribute grammar is a grammar with the following additions:

i. Associated with each grammar symbol X is a set of attributes A(X). The set consists of two disjoint sets, synthesized attributes and inherited attributes.

ii. Associated with each grammar rule is a set of semantic functions and a possibly empty set of predicate functions over the attributes of the symbols in the grammar rule.

iii. For a rule X0 → X1...Xn,

a) The synthesized attributes of X0 are computed with a semantic function of the form

S(X0) = f(A(X1), ... , A(Xn))
meaning that their values depend only on the attribute values of their child nodes.

b) The inherited attributes of Xj, 1 ≤ j ≤ n, are computed with a semantic function of the form

I(Xj) = f(A(X0))
meaning that their values depend only on the attribute values of their parent node.

Intrinsic Attributes. They are synthesized attributes of leaf nodes whose values are determined outside the parse tree.

e.g. The data type of a variable in a program could come from a symbol table.

Example: An attribute grammar for simple assignment statement.

1. Syntax rule: <assign> → <var> := <expr>
   Semantic rule: <var>.env ← <assign>.env
                  <expr>.env ← <assign>.env
                  <assign>.lhs_type ← <var>.actual_type
                  <expr>.expected_type ← <assign>.lhs_type


2. Syntax rule: <expr> → <var>[2] + <var>[3]
   Semantic rule: <var>[2].env ← <expr>.env
                  <var>[3].env ← <expr>.env
                  <expr>.actual_type ← if (<var>[2].actual_type = int_type) and (<var>[3].actual_type = int_type)
                                          then int_type
                                          else real_type
                                       end if
   Predicate: <expr>.actual_type = <expr>.expected_type

3. Syntax rule: <expr> → <var>
   Semantic rule: <expr>.actual_type ← <var>.actual_type
                  <var>.env ← <expr>.env
   Predicate: <expr>.actual_type = <expr>.expected_type

4. Syntax rule: <var> → A | B | C
   Semantic rule: <var>.actual_type ← look-up(RHS, <var>.env)

actual_type. A synthesized attribute associated with the non-terminals <var> and <expr>. It is used to store either int_type or real_type. In the case of a variable, the actual type is intrinsic.

expected_type. i. An inherited attribute associated with the non-terminal <expr>. It is used to store either int_type or real_type.

ii. It is determined by the type of the variable on the left side of the assignment statement.

lhs_type. A synthesized attribute associated with <assign>. It is used to move the value of the synthesized actual_type of the LHS of an assignment statement to the inherited attribute expected_type of <expr>.

env. An inherited attribute associated with the non-terminals <assign>, <expr> and <var>. It carries the reference to the correct symbol table entries down to the instances of variables.

Figure 11. The flow of attributes in the parse tree for A := A + B (nodes <assign>, <expr> and three <var>; attributes env, lhs_type, expected_type and actual_type).


Figure 12. A fully attributed parse tree for A := A + B

<assign>         env = table_1, lhs_type = real_type
  <var> (A)      env = table_1, actual_type = real_type
  <expr>         env = table_1, expected_type = real_type, actual_type = real_type
    <var> (A)    env = table_1, actual_type = real_type
    <var> (B)    env = table_1, actual_type = int_type
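The attribute values in Figure 12 can be reproduced by evaluating the semantic rules in an order that respects the attribute flow. The Python sketch below is only one way to do this: it hard-codes the shape of the tree for an assignment of the form `<var> := <var> + <var>`, represents nodes as dictionaries, and assumes a symbol table `table_1` in which A is real and B is integer. The function names are hypothetical.

```python
# Sketch of the attribute flow for <assign> -> <var> := <var> + <var>.
# The symbol table contents below are assumed for illustration (A real, B int).

INT, REAL = "int_type", "real_type"


def var_node(name, env):
    """<var>: env is inherited; actual_type is intrinsic (symbol table look-up)."""
    return {"name": name, "env": env, "actual_type": env[name]}


def expr_node(left, right, env, expected_type):
    """<expr> -> <var>[2] + <var>[3]: env and expected_type are inherited."""
    v2, v3 = var_node(left, env), var_node(right, env)
    both_int = v2["actual_type"] == INT and v3["actual_type"] == INT
    actual = INT if both_int else REAL            # synthesized from the children
    return {
        "env": env,
        "expected_type": expected_type,
        "actual_type": actual,
        "predicate_ok": actual == expected_type,  # the rule's predicate
        "children": [v2, v3],
    }


def assign_node(target, left, right, env):
    """<assign> -> <var> := <expr>, evaluated in an order that respects the flow."""
    lhs = var_node(target, env)
    lhs_type = lhs["actual_type"]                 # synthesized from the LHS <var>
    expr = expr_node(left, right, env, expected_type=lhs_type)
    return {"env": env, "lhs_type": lhs_type, "var": lhs, "expr": expr}


table_1 = {"A": REAL, "B": INT}
tree = assign_node("A", "A", "B", table_1)        # A := A + B
```

With the same table, `assign_node("B", "A", "B", table_1)` produces an `<expr>` whose predicate fails, because the expected type is int_type while the actual type is real_type.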

Evaluation.

i. Attribute grammars provide a complete description of the syntax and static semantics of a programming language; they have been used as the formal definition of a language that can be input to a compiler generation system.

ii. Difficulties. The size and complexity of the grammar; a large parse tree is costly to evaluate.

Code Generation

The code specific to the target machine is generated.

As the output is machine code, several machine code instructions are usually generated for each high-level language instruction.

e.g. LET A = B + C in Basic.

In Code Generation,

i. remove the redundant word LET.

ii. search the symbol table for the locations of A, B and C.

iii. generate the necessary machine code.
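The three steps can be sketched for this one statement form. Everything machine-specific below is an assumption made for illustration: `LOAD`, `ADD` and `STORE` stand for a hypothetical single-accumulator instruction set, the addresses in the symbol table are made up, and `generate_let` is not a routine from any real compiler.

```python
# Sketch only: LOAD/ADD/STORE form an assumed single-accumulator instruction
# set, and the addresses in the symbol table are made up for illustration.

def generate_let(statement, symbol_table):
    """Generate code for a statement of the form `LET <var> = <var> + <var>`."""
    tokens = statement.split()
    if tokens and tokens[0].upper() == "LET":     # i. drop the redundant LET
        tokens = tokens[1:]
    if len(tokens) != 5 or tokens[1] != "=" or tokens[3] != "+":
        raise ValueError(f"unsupported statement: {statement!r}")
    target, _, left, _, right = tokens
    # ii. look up the storage location of each variable
    addr = {name: symbol_table[name] for name in (target, left, right)}
    # iii. emit several machine-level instructions for the one statement
    return [
        f"LOAD  {addr[left]}",
        f"ADD   {addr[right]}",
        f"STORE {addr[target]}",
    ]


code = generate_let("LET A = B + C", {"A": 100, "B": 101, "C": 102})
```

One BASIC statement becomes three instructions here, which illustrates why several machine code instructions are usually generated per high-level statement.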

Note that parse trees have often been built before this phase; they can be used in the generation.

Routines from the system library may often have to be called up, e.g. write procedure of Pascal.

Optimisation. Often the code produced by such methods is not the best that could be obtained. It is possible to make more efficient machine code by carrying out a process which is called optimisation.

Reverse Polish notation (Postfix)**

The reverse Polish notation is used to parse and represent arithmetic expressions in a compiler.

Polish notation is also known as prefix notation because each operator precedes its operands.

A ‘normal’ arithmetic expression is as follows:
(3+5)x(9-7)

This is called infix notation because each operator is placed between its operands. The Polish notation of it is as follows:

x+35-97


The Polish notation has the advantage that there can be no ambiguity in the way that an arithmetic expression can be worked out. It also needs no parentheses to separate the different parts.

Another notation is the reverse Polish (or postfix) notation, which is very similar in principle and is also parentheses-free. Reverse Polish notation is particularly suited to computerised methods because such expressions are easy to evaluate by using a stack.

The reverse Polish notation of the above expression is as follows:
35+97-x

This leads to the following very simple rules for evaluating such expressions:

i. The next symbol encountered must be loaded on to the stack if it is an operand, i.e. a number or variable which is to be operated upon.

ii. If the next symbol to be encountered is an operator, i.e. +, /, -, etc. then carry out the required operation on the top two items in the stack. The result of this operation must be left on the top of the stack.
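The two rules can be sketched directly with a Python list used as the stack. The sketch assumes single-digit operands and uses `x` for multiplication to match the notation of the example expression; the function name `evaluate_postfix` is illustrative.

```python
def evaluate_postfix(symbols):
    """Evaluate a postfix expression given as a sequence of symbols.

    Operands are single digits here, and 'x' stands for multiplication,
    matching the notation used in the notes' example.
    """
    operations = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "x": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for symbol in symbols:
        if symbol in operations:
            right = stack.pop()        # top of stack is the right-hand operand
            left = stack.pop()
            stack.append(operations[symbol](left, right))  # rule ii
        else:
            stack.append(int(symbol))  # rule i: load the operand
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


value = evaluate_postfix("35+97-x")
```

For `35+97-x` the stack holds 8 after the `+`, then 8 and 2 after the `-`, and finally the single value 16, which agrees with (3+5)x(9-7).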

To convert an infix arithmetic expression to a postfix one, a stack and a table of operator precedence are used.

Assume the following rules of precedence are used:

Operator     Precedence
( ) and ^    3
* and /      2
+ and -      1
=            0


The algorithm of the conversion is shown in the following flowchart:

[Figure 13. Flowchart of the infix-to-postfix conversion. Each symbol read is tested: an operand is output to the postfix string; "(" is put on the stack; for ")", the top of the stack is output repeatedly until "(" is on top, and the "(" is then removed; an operation is put on the stack if it has higher precedence than that on the stack or the stack is empty, otherwise the top of the stack is output first; at the end of the expression the top of the stack is output until the stack is empty; any other symbol gives an error report.]

To convert the expression V+W^X*Y/(Z-1):

Symbol being considered   Output postfix string   Stack (→ bottom)
V                         V
+                         V                       +
W                         VW                      +
^                         VW                      ^+
X                         VWX                     ^+
*                         VWX^                    *+
Y                         VWX^Y                   *+
/                         VWX^Y*                  /+
(                         VWX^Y*                  (/+
Z                         VWX^Y*Z                 (/+
-                         VWX^Y*Z                 -(/+
1                         VWX^Y*Z1                -(/+
)                         VWX^Y*Z1-               /+
end of string             VWX^Y*Z1-/+             stack empty
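The conversion traced above can be sketched in Python. Two points are assumptions read off the worked trace rather than stated rules: a "(" on the stack acts as a marker that no operator unstacks (the trace pushes `-` directly on top of it), and an operator of equal precedence is unstacked before the new one is pushed (as when `/` unstacks `*`). Operands are assumed to be single letters or digits, and the name `infix_to_postfix` is illustrative.

```python
# Precedence table from the notes; '(' is handled separately as a stack marker.
PRECEDENCE = {"^": 3, "*": 2, "/": 2, "+": 1, "-": 1, "=": 0}


def infix_to_postfix(expression):
    """Convert an infix string of single-character symbols to postfix."""
    output, stack = [], []
    for symbol in expression:
        if symbol == " ":
            continue
        if symbol.isalnum():                      # operand
            output.append(symbol)
        elif symbol == "(":
            stack.append(symbol)
        elif symbol == ")":
            while stack and stack[-1] != "(":     # unstack back to the '('
                output.append(stack.pop())
            if not stack:
                raise ValueError("unmatched ')'")
            stack.pop()                           # discard the '('
        elif symbol in PRECEDENCE:
            # Push only when the stack is empty, '(' is on top, or the new
            # operator has strictly higher precedence; otherwise unstack first.
            while (stack and stack[-1] != "("
                   and PRECEDENCE[symbol] <= PRECEDENCE[stack[-1]]):
                output.append(stack.pop())
            stack.append(symbol)
        else:
            raise ValueError(f"unexpected symbol: {symbol!r}")
    while stack:                                  # end of string: empty the stack
        top = stack.pop()
        if top == "(":
            raise ValueError("unmatched '('")
        output.append(top)
    return "".join(output)


postfix = infix_to_postfix("V+W^X*Y/(Z-1)")
```

Because equal precedence causes unstacking, every operator is treated as left-associative in this sketch, including `^`; a language that makes exponentiation right-associative would need a different comparison for that operator.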
