Manual on Programming Languages

7/31/2019 Manual on Programming Languages


Programming Languages

UNIVERSITY OF THE CORDILLERAS


PRELIMINARIES

1.1 Reasons for Studying Concepts of Programming Languages

Increased capacity to express ideas.
- It is difficult for people to conceptualize structures that they cannot describe, verbally or in writing.
- Awareness of a wider variety of programming language features can reduce limitations in software development.
- Programmers can increase the range of their software-development thought processes by learning new language constructs.
- Studying language concepts builds an appreciation for valuable language features and encourages programmers to use these features.

Improved background for choosing appropriate languages.
- Many professional programmers have had little formal education in computer science and were trained on the job or through in-house training programs.
- Many other programmers received their formal training in the early days of computer science education, when few languages were widely known.
- The result of this narrow background is that many programmers, when given a choice of languages for a new project, continue to use the language with which they are most familiar, even if it is poorly suited to the new project.
- If these programmers were familiar with other languages, they would be in a better position to make informed language choices.

Increased ability to learn new languages.
- Computer programming is a young discipline, and design methodologies, software development tools, and programming languages are still in a state of continuous evolution.
- The process of learning a new programming language can be lengthy and difficult, especially for someone who is comfortable with only one or two languages and has never examined programming language concepts in general.
- Once a thorough understanding of the fundamental concepts of languages is acquired, it becomes far easier to see how these concepts are incorporated into the design of the language being learned.
- It is essential that practicing programmers know the vocabulary and fundamental concepts of programming languages so they can read and understand programming language manuals and sales literature for languages and compilers.

Better understanding of the significance of implementation.
- Understanding implementation allows us to visualize how a computer executes various language constructs.
- It helps us understand the relative efficiency of alternative constructs that may be chosen for a program.
- This in turn leads to the ability to use a language more intelligently, as it was designed to be used.
- We can become better programmers by understanding the choices among programming language constructs and the consequences of those choices.



- Certain kinds of program bugs can only be found and fixed by a programmer who knows some related implementation details.

Increased ability to design new languages.
- To a student, the possibility of being required at some future time to design a new programming language may seem remote.
- However, most professional programmers occasionally do design languages of one sort or another.

Overall advancement of computing.
- Finally, there is a global view of computing that can justify the study of programming language concepts.
- Although it is usually possible to determine why a particular programming language became popular, it is not always clear, at least in retrospect, that the most popular languages are the best available.
- In some cases, it might be concluded that a language became widely used, at least in part, because those in positions to choose languages were not sufficiently familiar with programming language concepts.
- In general, if those who choose languages are better informed, better languages will more quickly squeeze out poorer ones.

1.2 Programming Domains

Scientific Applications
- Typically, scientific applications have simple data structures but require large numbers of floating-point arithmetic computations.
- For some scientific applications where efficiency is the primary concern, like those that were common in the 1950s and 1960s, no subsequent language is significantly better than FORTRAN.

Business Applications
- The use of computers for business applications began in the 1950s.
- The first successful high-level language for business was COBOL, which appeared in 1960.
- Business languages are characterized, according to the needs of the application, by elaborate input and output facilities and decimal data types.
- With the advent of microcomputers came new ways for businesses, especially small businesses, to use computers. Two specific tools, spreadsheet systems and database systems, were developed for business and are now widely used.

Artificial Intelligence
- AI is a broad area of computer applications characterized by the absence of exact algorithms and the use of symbolic computation rather than numeric computation.
- Symbolic computation means that symbols, consisting of names rather than numbers, are manipulated.



- The first widely used programming language developed for AI applications was the functional language LISP, which appeared in 1959 (Scheme is a later dialect).
- An alternative approach to these applications appeared in the early 1970s: logic programming using the Prolog language.

Systems Programming Languages
- The operating system and all of the programming support tools of a computer system are collectively known as its systems software.
- Systems software is used almost continuously and therefore must have execution efficiency.
- A language for this domain must have low-level features that allow the software interfaces to external devices to be written.
- In the 1960s and 1970s, some computer manufacturers, such as IBM, Digital, and Burroughs (now UNISYS), developed special machine-oriented high-level languages for systems software on their machines. For IBM mainframe computers, the language was PL/S, a dialect of PL/I; for Digital, it was BLISS, a language at a level just above assembly language; for Burroughs, it was Extended ALGOL.
- The UNIX operating system is written almost entirely in C, which has made it relatively easy to port, or move, to different machines.

Very High-Level Languages (VHLLs)
- The languages in the category called very high-level have evolved slowly over the past 25 years.
- The various scripting languages for UNIX are examples of VHLLs. A scripting language is one that is used by putting a list of commands, called a script, in a file to be executed.
- The first of these languages, named shell, began as a small collection of commands that were interpreted to be calls to system subprograms that performed utility functions, such as file management and simple file filtering.
- Other VHLLs are awk, for report generation, and tcl combined with tk, which provide a method for building X Window applications. Perl is a combination of shell and awk.

Special-Purpose Languages
- A host of special-purpose languages have appeared over the past 40 years.
- They range from RPG, which is used to produce business reports, to APT, which is used for instructing programmable machine tools, to GPSS, which is used for systems simulation.



Where Reg1 and Reg2 are registers. The semantics are:

    Reg1 ← contents(Reg1) + contents(memory_cell)
    Reg1 ← contents(Reg1) + contents(Reg2)

VAX superminicomputers (orthogonal):

    ADDL operand_1, operand_2

Where the semantics is:

    operand_2 ← contents(operand_1) + contents(operand_2)

ALGOL 60: too orthogonal
Ex. Record + array (no restrictions on the types)
    if A == B = x+9 (condition, assignment, and arithmetic combined)

Control Statements
- Facilities to transfer control of program execution from one program part to another.
- Indiscriminate use of goto statements severely reduces program readability.
- Remedies when using goto:
  1. They must precede their targets, except when used for loops.
  2. Their targets must never be too distant.
  3. Their number must be limited.
- The control statement design of a language can be an important factor in the readability of programs written in that language.

Data Types and Structures
- The presence of adequate facilities for defining data types and data structures in a language is another significant aid to readability.
  Ex. Sum_is_too_big := 1      (no Boolean type)
      Sum_is_too_big := true   (Boolean type available)
- Record data type vs. collection of similar arrays

Syntax Considerations
- The syntax, or form, of the elements of a language has a significant effect on the readability of programs. The following are three examples of syntactic design choices that affect readability:
  1. Identifier forms
     - length of the identifiers (names)
     - case sensitivity
     - presence of connectors
  2. Special words
     - delimiters
     - short-circuit evaluation
  3. Form and meaning
     - the meaning should follow directly from the form, or syntax.

Writability
- A measure of how easily a language can be used to create programs for a chosen problem domain.
- Most of the language characteristics that affect readability also affect writability.



Simplicity and Orthogonality
- A smaller number of primitive constructs and a consistent set of rules for combining them (that is, orthogonality) is much better than simply having a large number of primitives.
- A programmer can design a solution to a complex problem after learning only a simple set of primitive constructs.
- BUT too much orthogonality can be a detriment to writability. Errors in writing programs can go undetected when nearly any combination of primitives is legal, which leads to misuse or disuse of features.

Support for Abstractions
- Abstraction means hiding the details of implementation.
- The degree of abstraction allowed by a programming language and the naturalness of its expression are very important to its writability.
- Programming languages can support two distinct categories of abstraction:
  1. Data abstraction
     Ex. binary tree
  2. Process abstraction
     Ex. subprograms

Expressivity
- There are very powerful operators that allow a great deal of computation to be accomplished with a very small program.
- The language has relatively convenient, rather than cumbersome, ways of specifying computations.

Reliability
A program is said to be reliable if it performs to its specifications under all conditions.
1. Type Checking
   - Testing for type errors in a given program, either by the compiler or during program execution.
     Ex. Type compatibility between two variables.
   - The earlier errors in programs are detected, the less expensive it is to make the required repairs.
   - Consider space, time, and accuracy.
2. Exception Handling
   - The ability of a program to intercept run-time errors (as well as other unusual conditions detected by the program), take corrective measures, and then continue execution.
3. Aliasing
   - Having two distinct referencing methods, or names, for the same memory cell.
   - It is now widely accepted that aliasing is a dangerous feature of a programming language.
4. Readability and Writability
   - The easier a program is to write, the more likely it is to be correct.
   - Programs that are difficult to read are difficult both to write and to modify.
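The danger of aliasing mentioned under Reliability can be sketched in C++. This is only an illustration: the function and variable names are hypothetical, and the point is that the same function gives a surprising result when its two reference parameters name the same memory cell.

```cpp
#include <cassert>

// Adds `amount` to `total` twice. The author of this function probably
// assumes that `total` and `amount` name different memory cells.
void add_twice(int& total, const int& amount) {
    total += amount;  // if amount aliases total, amount changes here too
    total += amount;
}

// Distinct cells: 10 + 5 + 5.
int without_alias() {
    int total = 10;
    int amount = 5;
    add_twice(total, amount);
    return total;
}

// One cell under two names: 10 + 10 = 20, then 20 + 20 = 40.
int with_alias() {
    int total = 10;
    add_twice(total, total);
    return total;
}
```

Calling without_alias() yields 20, while with_alias() yields 40 rather than the 30 a reader might expect from "add 10 twice".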



Cost
- Cost of training programmers to use the language
- Cost of writing programs in the language
- Both costs of training and writing programs
- Cost of compiling programs
- Cost of executing programs
- Cost of compilers
- Cost of poor reliability
- Cost of maintaining programs
- Optimization is the name given to the collection of methods that compilers may use to decrease the size and/or increase the execution speed of the code they produce.

Criteria for evaluation:
1. Portability
   - The ease with which programs can be moved from one implementation to another. (Standardization)
2. Generality
   - The applicability to a wide range of applications.
3. Well-definedness
   - The completeness and precision of the language's official defining document.

1.4 Influences on Language Design

Computer Architecture
- The basic architecture of computers has a crucial effect on language design. Most of the popular languages of the past 35 years have been designed around the prevalent computer architecture, called the von Neumann architecture. These languages are called imperative languages. In a von Neumann computer, both data and programs are stored in the same memory. The CPU, which executes instructions, is separate from the memory.
- Central features of imperative languages are:
  1. Variables, which model the memory cells
  2. Assignment statements, which are based on the piping operation
  3. Iteration, the construct for repetition

Programming Methodologies

Data-Oriented
- Simply put, data-oriented methods emphasize data design, concentrating on the use of logical, or abstract, data types to solve problems.
- Object-oriented methodology begins with data abstraction, which encapsulates processing with data objects and hides access to data, and adds inheritance and dynamic type binding. Inheritance is a powerful concept that greatly enhances the possibility of reuse of existing software. Reuse of software components promises to significantly increase software development productivity.

Process-Oriented
- Opposite of data-oriented programming.
- Focuses on concurrency.



1.5 Language Design Trade-offs

The task of choosing constructs and features when designing a programming language involves a collection of compromises and trade-offs.

- Conflicting criteria:
  1. Reliability vs. cost of execution
  2. Writability vs. readability
  3. Flexibility vs. safety

1.6 Implementation Methods

Compilation translates programs from a high-level language to machine language, which can be executed directly on the computer.

[Figure: the compilation process. Source program, lexical analyzer (lexical units), syntax analyzer (parse trees), intermediate code generator and semantic analyzer (intermediate code), optional optimization, code generator (machine language), computer, which takes the input data and produces the results. A symbol table is used by the phases.]



Pure Interpretation: programs are interpreted by another program (called the interpreter) without going through any form of translation.

[Figure: pure interpretation. The source program and the input data go to the interpreter, which produces the results.]

Hybrid Implementation Systems translate high-level language programs to an intermediate language designed to allow easy interpretation.

[Figure: hybrid implementation system. Source program, lexical analyzer (lexical units), syntax analyzer (parse trees), intermediate code generator (intermediate code), interpreter, which takes the input data and produces the results.]



1.7 Programming Environments

The collection of tools used in the development of software:
1. File system: secondary memory
2. Text editor: with debugger, optimizer
3. Linker: preliminary step in the completion of the result
4. Compiler: large collection of integrated tools

1.8 Programming Paradigms

A programming paradigm is a problem-solving approach.

[Figure: taxonomy of programming paradigms. Programming languages divide into process-oriented and data-oriented. The branches shown are imperative, data flow, functional, constraint, rule, object, and database; the leaf labels include list processing, array processing, string processing, production system, logic, access oriented, and object oriented. Example entries named in the figure: FORTRAN, ALGOL, COBOL, BASIC, Pascal, C, Modula-2, Ada, experimental languages, LISP, Logo, Scheme, APL, SNOBOL, PERL, spreadsheet packages, PROLOG, Level 5, VB, C++, Java, query languages, and 4GLs.]



NAMES, BINDINGS, TYPE CHECKING, AND SCOPES

2.1 Names

Names are associated with variables, labels, subprograms, and formal parameters.

Design Issues:
1. What is the maximum length of a name?
2. Can connector characters be used in names?
3. Are names case sensitive?
4. Are the special words reserved words or keywords?

Special Words
- Used to make programs more readable
- Used to separate the syntactic entities of programs
- Keyword: a word in a programming language that is special only in certain contexts
  Ex. PASCAL
      var true: integer;
          flag: boolean;
      begin
        flag := true;
        true := 1;
      end;
- Reserved word: a special word that cannot be used as a name
  Ex. C
      int float = 2;
      /* invalid: float cannot be used as a variable name because it is a
         reserved word that signifies a data type */
- Predefined names: names that have a predefined meaning but can be redefined by the user. They must be visible to the compiler when used.
  Ex. C
      clrscr();
      PASCAL
      writeln(); readln();

2.2 Variables

A variable is an abstraction of a computer memory cell or collection of cells. A variable can be characterized as a sextuple of attributes:
1. Name: as discussed in 2.1; names are often referred to as identifiers.
2. Address: the memory address with which the variable is associated.
3. Value: the contents of the memory cell or cells associated with the variable.
4. Type: determines the range of values the variable can have and the set of operations that are defined for values of the type.
5. Lifetime: the time during which the variable is bound to a specific memory location.
6. Scope: the range of statements in which the variable is visible. For a variable declared in a procedure, the scope is from its declaration to the end reserved word of the procedure.



2.3 The Concept of Binding

- Binding is an association, such as between an attribute and an entity or between an operation and a symbol.
- Binding time is the time at which a binding takes place.
- Bindings can take place at language design time, language implementation time, compile time, link time, load time, or run time.

Ex. C
    int count;
    . . .
    count = count + 5;

Some of the bindings and their binding times for the parts of this assignment statement are as follows:
- Set of possible types for count: bound at language design time.
- Type of count: bound at compile time.
- Set of possible values of count: bound at compiler design time.
- Value of count: bound at execution time with this statement.
- Set of possible meanings for the operator symbol +: bound at language definition time.
- Meaning of the operator symbol + in this statement: bound at compile time.
- Internal representation of the literal 5: bound at compiler design time.

Binding of Attributes to Variables
- A binding is static if it occurs before run time and remains unchanged throughout program execution.
- A binding is dynamic if it occurs during run time or can change in the course of program execution.

Type Bindings
- Before a variable can be referenced in a program, it must be bound to a data type.
- The two important aspects of this binding are how the type is specified and when the binding takes place.
- Types can be specified statically through some form of explicit or implicit declaration.
- An explicit declaration is a statement in a program that lists variable names and declares them to be of a particular type.
- An implicit declaration is a means of associating variables with types through default conventions instead of declaration statements.

Dynamic Type Binding
- The type is not specified by a declaration statement.
- The variable is bound to a type when it is assigned a value in an assignment statement.
- When the assignment statement is executed, the variable being assigned is bound to the type of the value, variable, or expression on the right side of the assignment.
- The primary advantage of dynamic binding of variables to types is that it provides a great deal of programming flexibility.



Ex. SNOBOL4
    LIST ← 10.2 5.1 0.0   (causes LIST to be a one-dimensional array)
    LIST ← 47             (causes LIST to be an integer variable)

- Languages with dynamic type binding are often implemented using interpreters rather than compilers. This is partially because it is difficult to change the types of variables dynamically in machine code.
- There are two disadvantages of dynamic type binding:
  1. The error detection capability of the compiler is diminished relative to a compiler for a language with static type bindings, because any two types can appear on opposite sides of the assignment operator.
  2. The cost of implementing dynamic attribute binding is considerable, particularly in execution time. Type checking must be done at run time. Furthermore, every variable must have a descriptor associated with it to maintain the current type. The descriptors must also be of varying size, because more space is needed if the variable is a structured type than if it is a primitive type.
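The run-time descriptor mentioned above can be sketched in C++17 with std::variant, whose stored index plays the role of the descriptor recording the current type. This is an illustration only, not how any particular interpreter works; the names Value, type_of, and demo_rebinding are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// A hypothetical dynamically typed variable. The variant records which
// alternative it currently holds, much like a run-time type descriptor.
using Value = std::variant<int, double, std::vector<double>>;

// Run-time type check: report the type currently bound to the variable.
std::string type_of(const Value& v) {
    if (std::holds_alternative<int>(v)) return "integer";
    if (std::holds_alternative<double>(v)) return "real";
    return "array";
}

// Rebinding by assignment, mirroring the LIST example.
std::string demo_rebinding() {
    Value list = std::vector<double>{10.2, 5.1, 0.0};  // LIST is an array
    std::string before = type_of(list);
    list = 47;                                         // LIST is now an integer
    return before + " -> " + type_of(list);
}
```

Here demo_rebinding() returns "array -> integer". Every use of the variable has to consult the descriptor first, which is the execution-time cost the second disadvantage describes.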

Type Inference
- An inferencing mechanism is one in which the types of most expressions can be determined without requiring the programmer to specify the types of the variables.
  Ex. (ML)
      fun circumf (r) = 3.14159 * r * r;
      - a function that takes a real argument and produces a real result.

      fun times10 (x) = 10 * x;
      - the argument and functional value are inferred to be of type integer.

      fun square (x) = x * x;
      - the type cannot be inferred. Instead, specify it explicitly in one of these ways:

      fun square (x) : int = x * x;
      fun square (x : int) = x * x;
      fun square (x) = (x : int) * x;
      fun square (x) = x * (x : int);
- Type inference is also used in purely functional languages.

Storage Bindings and Lifetime
- Allocation is the process by which the memory cell to which a variable is bound is taken from a pool of available memory.
- Deallocation is the process of placing a memory cell that has been unbound from a variable back into the pool of available memory.
- The lifetime of a program variable is the time during which the variable is bound to a specific memory location. So the lifetime of a variable begins when it is bound to a specific cell and ends when it is unbound from that cell.
- Static Variables
  - Those that are bound to memory cells before program execution begins and remain bound to those same memory cells until program execution terminates.
  - The greatest advantage of static variables is efficiency. All addressing of static variables can be direct.



  - No run-time overhead is incurred for allocation and deallocation.
  - The disadvantage of static binding to storage is reduced flexibility; in particular, in a language that has only variables that are statically bound to storage, recursive subprograms are not supported.
- Stack-Dynamic Variables
  - Those whose storage bindings are created when their declaration statements are elaborated, but whose types are statically bound.
  - Elaboration of such a declaration refers to the storage allocation and binding process indicated by the declaration, which takes place when execution reaches the code to which the declaration is attached.
  - Elaboration occurs during run time.
  - In C, local variables are by default stack-dynamic but can be made static by adding the static qualifier to their definitions.
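The difference between static and stack-dynamic storage can be seen in a few lines of C++ (the same qualifier works in C). This is a minimal sketch with hypothetical function names.

```cpp
#include <cassert>

// `calls` is static: it is bound to one memory cell for the whole run
// and keeps its value between calls.
int count_static() {
    static int calls = 0;
    return ++calls;
}

// `calls` is stack-dynamic: a fresh cell is allocated and initialized
// each time the declaration is elaborated.
int count_stack_dynamic() {
    int calls = 0;
    return ++calls;
}

// Recursion relies on stack-dynamic storage: each activation of
// factorial has its own copy of the parameter n.
int factorial(int n) {
    return n <= 1 ? 1 : n * factorial(n - 1);
}
```

Successive calls to count_static() return 1, 2, 3, and so on, while count_stack_dynamic() returns 1 every time. The factorial function shows why a language with only static variables cannot support recursive subprograms.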

- Explicit Heap-Dynamic Variables
  - Are nameless objects whose storage is allocated and deallocated by explicit run-time instructions specified by the programmer. These variables, which are allocated from and deallocated to the heap, can only be referenced through pointer variables.
  - An example using a C++ code segment:
        int *intnode;
        . . .
        intnode = new int;   // allocates an int object
        . . .
        delete intnode;      // deallocates the object to which intnode points
  - Explicit heap-dynamic variables are often used for dynamic structures, such as linked lists and trees, that need to grow and shrink during execution.
  - The disadvantages of such variables are the difficulty of using them correctly and the cost of references, allocations, and deallocations.
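As a sketch of the dynamic structures mentioned above, the following C++ fragment builds a small linked list from explicit heap-dynamic variables and then returns every node to the heap. The Node type and the helper names are hypothetical.

```cpp
#include <cassert>

// One node of a singly linked list of ints.
struct Node {
    int value;
    Node* next;
};

// Allocate a nameless heap object and link it in front of `head`.
Node* push_front(Node* head, int value) {
    return new Node{value, head};
}

// Sum the values by following the pointers.
int sum(const Node* head) {
    int total = 0;
    for (const Node* p = head; p != nullptr; p = p->next) total += p->value;
    return total;
}

// Explicitly return every node to the heap; forgetting this leaks memory.
void destroy(Node* head) {
    while (head != nullptr) {
        Node* next = head->next;
        delete head;
        head = next;
    }
}

// Build the list 3 -> 2 -> 1, sum it, and free it.
int demo_list() {
    Node* head = nullptr;
    for (int i = 1; i <= 3; ++i) head = push_front(head, i);
    int total = sum(head);
    destroy(head);
    return total;
}
```

The nodes can only be reached through pointers, and the programmer is responsible for pairing each new with a delete, which is the difficulty of correct use noted above.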

- Implicit Heap-Dynamic Variables
  - Are bound to heap storage only when they are assigned values. In fact, all their attributes are bound every time they are assigned.
  - The advantage of such variables is that they have the highest degree of flexibility, allowing highly generic code to be written.
  - The disadvantage is the run-time overhead of maintaining all the dynamic attributes, which could include array subscript types and ranges, among others. Another is the loss of some error detection by the compiler.

2.4 Type Checking
- Type checking is the activity of ensuring that the operands of an operator are of compatible types.
- A compatible type is one that is either legal for the operator or is allowed under language rules to be implicitly converted by compiler-generated code to a legal type.
- This automatic conversion is called coercion.
- A type error is the application of an operator to an operand of an inappropriate type.
- If all bindings of variables to types are static in a language, then type checking can nearly always be done statically.



- Dynamic type binding requires type checking at run time, which is called dynamic type checking.
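Coercion and static type checking can be observed directly in C++. The sketch below uses hypothetical function names; the static_assert is evaluated by the compiler, so it illustrates checking that happens before run time.

```cpp
#include <cassert>
#include <type_traits>

// Coercion: the int operand is implicitly converted to double before the
// addition, so the result type is double. Checked at compile time.
static_assert(std::is_same<decltype(1 + 2.5), double>::value,
              "int + double is computed as double");

// a is coerced to double by compiler-generated code.
double mixed_sum(int a, double b) {
    return a + b;
}

// A coercion in the other direction that the compiler also accepts:
// the fractional part is silently discarded.
int truncate_to_int(double x) {
    int n = x;  // implicit double -> int conversion
    return n;
}

// By contrast, a statement such as
//     const char* s = 1 + 2.5;
// is a type error and is rejected at compile time.
```

The second function shows why coercions weaken error detection: a probable mistake compiles without complaint unless extra warnings are enabled.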

2.5 Strong Typing
- A strongly typed language is one in which each name in a program in the language has a single type associated with it, and that type is known at compile time.
- All types are statically bound.
- A programming language is said to be strongly typed if type errors are always detected.
- The importance of strong typing lies in its ability to detect all misuses of variables that result in type errors.
- A strongly typed language also allows detection, at run time, of uses of incorrect type values in variables that can store values of more than one type.

2.6 Type Compatibility
- Name type compatibility means that two variables have compatible types only if they are in either the same declaration or in declarations that use the same type name.
- Structure type compatibility means that two variables have compatible types if their types have identical structures.
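C++ struct types follow name compatibility, which makes the distinction easy to demonstrate. In this sketch the type names Celsius, Fahrenheit, and Temperature are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

struct Celsius    { double degrees; };
struct Fahrenheit { double degrees; };  // same structure, different name

using Temperature = Celsius;            // an alias, not a new type

// Name compatibility: identical structure is not enough.
static_assert(!std::is_same<Celsius, Fahrenheit>::value,
              "distinct struct names are distinct types");
// An alias introduces no new type, so the two names are compatible.
static_assert(std::is_same<Celsius, Temperature>::value,
              "an alias names the same type");

double to_double(Celsius c) { return c.degrees; }

double demo_alias() {
    Temperature t{21.5};
    // to_double(Fahrenheit{70.7}) would be rejected by the compiler,
    // even though Fahrenheit has the same structure as Celsius.
    return to_double(t);  // accepted: Temperature is Celsius
}
```

Under structure compatibility the Fahrenheit call would be legal, which is more flexible but would let two conceptually different quantities be mixed by accident.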

2.7 Scope
- The scope of a program variable is the range of statements in which the variable is visible.
- A variable is visible in a statement if it can be referenced in that statement.
- A variable is local in a program unit or block if it is declared there.
- The nonlocal variables of a program unit or block are those that are visible within the program unit or block but are not declared there.

Static Scope
- The scope of a variable can be statically determined, that is, prior to execution.
- See the example on page 172.

Blocks
- A section of code

Dynamic Scope
- Is based on the calling sequence of subprograms, not on their spatial relationship to each other.
- The scope can be determined only at run time.
- See the example on page 177.

Evaluation of Dynamic Scoping
- The correct attributes of nonlocal variables visible to a program statement cannot be determined statically.
- Several kinds of programming problems follow directly from dynamic scoping.
- Dynamic scoping results in less reliable programs than static scoping.
- References to nonlocals cannot be statically type checked.



- Dynamic scoping also makes programs much more difficult to read, because the calling sequence of subprograms must be known to determine the meaning of references to nonlocal variables.
- On the other hand, dynamic scoping can be used to advantage in programming: subprograms inherit the context of their callers.
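C++ is statically scoped, so dynamic scoping can only be simulated. The sketch below assumes a hypothetical explicit stack of bindings for a variable x: a dynamically scoped reference resolves to the most recent binding in the call chain, while the static version resolves x from the program text.

```cpp
#include <cassert>
#include <vector>

int x = 1;  // global x: what a statically scoped reference in sub2 sees

// Hypothetical stack of bindings for x, one entry per active declaration.
// Under dynamic scoping a reference resolves to the newest entry.
std::vector<int> x_bindings = {1};

int sub2_static()  { return x; }                  // resolved from program text
int sub2_dynamic() { return x_bindings.back(); }  // resolved from call chain

// sub1 declares its own x = 2 and then calls sub2.
int sub1_static() {
    int x = 2;  // local x, invisible to sub2_static
    (void)x;
    return sub2_static();
}

int sub1_dynamic() {
    x_bindings.push_back(2);  // sub1's x becomes the newest binding
    int result = sub2_dynamic();
    x_bindings.pop_back();    // the binding ends when sub1 returns
    return result;
}
```

Calling sub1_static() yields 1, while sub1_dynamic() yields 2. The meaning of x inside sub2_dynamic depends on who called it, which is why such references cannot be checked or understood from the program text alone.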

2.8 Scope and Lifetime
- Relation: The scope of a variable declared in a procedure is from its declaration to the end reserved word of the procedure. The lifetime of that variable is the period of time beginning when the procedure is entered and ending when execution of the procedure reaches the end.

2.9 Referencing Environments
- The referencing environment of a statement is the collection of all names that are visible in the statement.
- The referencing environment of a statement in a static-scoped language is the variables declared in its local scope plus the collection of all variables of its ancestor scopes that are visible. (See pages 180-181 for examples.)

2.10 Named Constants
- A named constant is a variable that is bound to a value only at the time it is bound to storage; its value cannot be changed by assignment or by an input statement.
- Named constants are useful as aids to readability and program reliability.
- Readability can be improved, for example, by using the name pi instead of the constant 3.14159.

2.11 Variable Initialization
- It is convenient for variables to have values before the code of the program or subprogram in which they are declared begins executing.
- The binding of a variable to a value at the time it is bound to storage is called initialization.
- If the variable is statically bound to storage, binding and initialization occur before run time.
- If the storage binding is dynamic, initialization is also dynamic.



DATA TYPES

3.1 Introduction

Computer programs produce results by manipulating data. An important factor in determining the ease with which they can perform this task is how well the data types match the real-world problem space. It is crucial, therefore, that a language support the proper variety of data types and structures.

3.2 Primitive Data Types

The data types that are not defined in terms of other types are called primitive data types. Nearly all programming languages provide a set of primitive data types.

The primitive data types of a language are used, along with one or more type constructors, to provide the structured types.

a. Numeric Types

Integer
- The most common primitive data type
- Represented in a computer by a string of bits, with one of the bits representing the sign.
- Implementations:
  [Figure: integer layouts. One shows a sign bit followed by the binary integer; the others place a type descriptor in front of the sign bit and binary integer.]

Floating Point
- Model real numbers, but the representations are only approximations for most real numbers.
- Have value ranges that are defined in terms of precision and range. (Ex. π, e)
- Problem: loss of accuracy through arithmetic operations
- Implementations:
  Single precision: sign bit, 8-bit exponent, 23-bit fraction



  Double precision: sign bit, 11-bit exponent, 52-bit fraction

Decimal
- Store a fixed number of decimal digits, with the decimal point at a fixed position in the value.
- Uses the binary coded decimal (BCD) representation
- Ex. PL/I:
      DECLARE X FIXED DECIMAL(10,3)
  COBOL:
      X PICTURE 999V99

Boolean Types
- Have only two elements in their range (true and false)
- Often used for switches or flags
- Could be represented by a single bit, but the smallest addressable unit is normally used

Character Types
- Stored as numeric codings
- Commonly use the ASCII coding
- Java uses the Unicode coding

3.3 Structured Data Types

Character String Types
- One in which the objects consist of sequences of characters
- Design issues:
  - Should strings be a primitive type or simply a special kind of character array?
  - Should strings have static or dynamic length?
- String length options:

Static length string
- Ex:
      A : String[20]                     (Pascal)
      Character (len=15) Name1, Name2    (Fortran)
- Implementation:
  - Requires a compile-time descriptor with a field for the length
  [Figure: compile-time descriptor for a static string, with fields: Static string, Length, Address.]


Limited dynamic length string
- Allows strings to have varying length up to a declared and fixed maximum set by the variable definition.
- Ex: char A[20];
- Implementation:
  - Requires a run-time descriptor to store both the fixed maximum length and the current length
  [Figure: run-time descriptor for limited dynamic strings, with fields: Limited dynamic string, Maximum length, Current length, Address.]

Dynamic length string
- Strings have varying length with no maximum.
- Provides maximum flexibility
- Ex: Snobol4
      Newline = trim(input)
- Implementation:
  - Requires a simpler run-time descriptor; only the current length needs to be stored.

User-Defined Ordinal Types
- An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers.

Enumeration Types
An enumeration type is one in which all of the possible values, which are symbolic constants, are enumerated in the definition.

Ex. (Ada)
    type DAYS is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);

Design Issues:
- Is a literal constant allowed to appear in more than one type definition?
- If so, how is the type of an occurrence of the literal in the program checked?



Designs:

Pascal: Not allowed to be used in more than one

enumeration type definition Enumeration type variables can be used as

- array subscript

- for loop variables- case selector expressions

Can be compared using relational operatorExample:

type colortype = (red, blue, green, yellow);var color : colortype;. . .color := blue;if (color > red) . . .

Ada:

Literals are allowed to appear in more than onedeclaration in the same referencing environment.These are called overloaded literals.

Example:

type LETTERS is (A, B, C, D, E, F,G, H, I, J, K, L,M, N, O, P, Q, R,S, T, U, V, W, X,Y, Z);

type VOWELS is (A, E, I, O, U);

for LETTER in A..U loop (ambiguous)

for LETTER in VOWELS(A)..VOWELS(U) loop

Evaluation:

Common operations for enumeration types are predecessor, successor, position in the list of values, and value for a given position number. In Pascal, these operations are provided by built-in functions; for example, pred(blue) is red. In Ada, they are attributes; for example, LETTERS'PRED(B) is A.

Enumeration types improve readability in a very direct way: named values are easily recognized, whereas coded values are not. They also allow operations on those values to be type checked.
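The four common enumeration operations can be imitated in Python. The sketch below uses the standard enum module; the helper names pred, succ, pos, and val are hypothetical and only mirror the Pascal functions and Ada attributes named above.

```python
from enum import IntEnum


class Color(IntEnum):
    # Same order as the Pascal colortype example.
    RED = 0
    BLUE = 1
    GREEN = 2
    YELLOW = 3


def pred(value):
    # Predecessor; raises ValueError for the first value.
    return type(value)(value.value - 1)


def succ(value):
    # Successor; raises ValueError for the last value.
    return type(value)(value.value + 1)


def pos(value):
    # Position of the value in the list of values.
    return value.value


def val(enum_type, position):
    # Value for a given position number.
    return enum_type(position)
```

Because Color derives from IntEnum, the relational operators compare the underlying positions, so Color.BLUE > Color.RED holds just as color > red does in the Pascal example.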

Subrange Types
A subrange type is a contiguous subsequence of an ordinal type. For example, 12..14 is a subrange of the integer type.



Evaluation:

Subrange types enhance readability by making it clear to readers that variables of subrange types can store only certain ranges of values.

Reliability is increased with subrange types, because assigning a value to a subrange variable that is outside the specified range is detected as an error by the run-time system.

- Implementation of User-Defined Ordinal Types

Enumeration types are usually implemented by associating a non-negative integer value with each symbolic constant in the type. Typically, the first enumeration value is represented as 0, the second as 1, and so forth. As long as the association is constant, the integers can be used in place of the enumeration constants. Of course, the operations allowed are dramatically different from those of integers, except for the relational operators, which are identical.

In ANSI C and C++, enumeration types are often treated exactly like integers.

Subrange types are implemented in exactly the same way as their parent types, except that range checks must be included in every assignment. This increases code size and execution time but is usually considered well worth the cost. Also, a good optimizing compiler can optimize some of the checking away.
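The range check that accompanies every assignment to a subrange variable can be sketched as follows. Python has no subrange types, so the class Subrange and its method names are hypothetical stand-ins for the code a compiler would generate.

```python
class Subrange:
    """Hypothetical integer variable restricted to low..high."""

    def __init__(self, low, high, value):
        self.low = low
        self.high = high
        self.set(value)

    def set(self, value):
        # This comparison is the extra cost paid on each assignment.
        if not (self.low <= value <= self.high):
            raise ValueError(
                "%d is outside the range %d..%d" % (value, self.low, self.high)
            )
        self.value = value
```

Apart from the check in set, the variable is stored exactly like a plain integer, which matches the description above.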

Array Types
- An array is a homogeneous aggregate of data elements in which an individual element is identified by its position in the aggregate, relative to the first element.
- Array elements are referenced by means of a two-level syntactic mechanism: the aggregate name, followed by the subscripts (indexes).

Syntax: array_name[index] -> element

- Static binding: the binding of the subscript type to an array variable
- Dynamic binding: the binding of the subscript value ranges



Four Categories of Arrays

1. Static Array: the subscript ranges are statically bound and storage allocation is static (done before run time). The advantage of static arrays is efficiency: no dynamic allocation or deallocation is required.

2. Fixed Stack-Dynamic Array: the subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution. The advantage of fixed stack-dynamic arrays over static arrays is space efficiency. A large array in one procedure can use the same space as a large array in a different procedure, as long as both procedures are never active at the same time.

Eg.

    A : array[1..10] of integer;    (Pascal)

    int A[10];                      (C/C++)

3. Stack-Dynamic Array: the subscript ranges are dynamically bound and storage allocation is dynamic (done during run time). Once the subscript ranges are bound and the storage is allocated, they remain fixed during the lifetime of the variable. Its major advantage over static and fixed stack-dynamic arrays is flexibility.

Eg. Ada

    Get (LIST_LEN);
    declare
       LIST : array (1..LIST_LEN) of INTEGER;
    begin
       . . .
    end;

4. Heap-Dynamic Array: the binding of subscript ranges and storage allocation is dynamic, and can change any number of times during the array's lifetime. Arrays can grow and shrink during execution as the need for space changes.

Eg. Visual Basic

    Dim StudArr() As String
    ReDim StudArr(10) As String
    ReDim Preserve StudArr(15) As String

The number of subscripts allowed in arrays may vary.

Eg.
- FORTRAN I: limited to 3 dimensions
- FORTRAN IV: up to 7 dimensions
- Contemporary languages: no limitation



Array Initialization

Fortran 77:

    INTEGER LIST (3)
    DATA LIST /0, 5, 5/

ANSI C/C++:

    int list[] = {4, 5, 7, 83};
    char name[] = "Freddie";
    char *names[] = {"Jo", "Bob", "Jake", "Darcie"};

Ada:

    LIST : array (1..5) of INTEGER := (1, 3, 5, 7, 9);
    BUNCH : array (1..5) of INTEGER := (1=>3, 3=>4, others=>0);

Slices
- A slice of an array is some substructure of that array.

See example on Page 213

Implementation of array types

- Requires more compile-time effort than simple built-in data types.
- The code to allow accessing of array elements must be generated at compile time.
- At run time, this code must be executed to produce element addresses.
- Two ways to map multi-dimensional arrays to one dimension:
  1. Row Major Order
  2. Column Major Order
- The compile-time descriptor for a single-dimensioned array holds the information required to construct the access function.
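As a rough sketch of what an access function computes, the two mappings for a two-dimensional array can be written as follows. The function and parameter names are hypothetical, and the formulas assume equal-sized elements and the same lower bound for both subscripts.

```python
def row_major_address(base, i, j, n_cols, elem_size, low=0):
    # Rows are stored one after another: skip (i - low) full rows,
    # then (j - low) elements within the row.
    return base + ((i - low) * n_cols + (j - low)) * elem_size


def column_major_address(base, i, j, n_rows, elem_size, low=0):
    # Columns are stored one after another: skip (j - low) full columns,
    # then (i - low) elements within the column.
    return base + ((j - low) * n_rows + (i - low)) * elem_size
```

For a 3 x 4 array of 4-byte elements starting at address 1000, element [1][2] is at 1000 + (1*4 + 2)*4 = 1024 in row major order and at 1000 + (2*3 + 1)*4 = 1028 in column major order.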

[Figure: compile-time descriptor for a single-dimensioned array, with fields for array, element type, index type, number of dimensions, and index]

Record Types
A record is a possibly heterogeneous aggregate of data elements in which the individual elements are identified by names.

Records vs. Arrays

    Records                                Arrays
    - heterogeneous                        - homogeneous
    - fields are named with identifiers    - elements are referenced by index
    - may include unions



Pascal:

    type
      empRec = record
        fn, mi, ln : string[30];
        dept : string[2];
      end;
    var emp : empRec;
    begin
      writeln(emp.dept);
    end.

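C:

    #include <stdio.h>

    typedef struct {
        char fn[30], mi[30], ln[30];
        char dept[3];    /* two characters plus the terminating '\0' */
    } empRec;

    int main(void)
    {
        empRec emp = {"", "", "", "IT"};    /* sample values so the output is defined */
        printf("%s\n", emp.dept);
        return 0;
    }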

To reference a field:
- Dot notation (Pascal and C/C++)
- % notation (Fortran)

Fully qualified reference: all intermediate record names, from the largest enclosing record to the specified field, are named in the reference.

    employee.name := 'Bob';
    employee.age := 42;
    employee.sex := 'M';
    employee.salary := 23750.00;

Elliptical reference: record names can be omitted.

    with employee do
    begin
      name := 'Bob';
      age := 42;
      sex := 'M';
      salary := 23750.0;
    end; {end of with}



Implementation of Records:

- Fields of records are stored in adjacent memory locations.
- Field accesses are all handled using the offsets.

[Figure: compile-time descriptor for a record. For each field, from field 1 to field n, it holds the name, type, and offset; it also holds the address of the record.]

Union Types
A union is a type that may store values of different types at different times during program execution.

Design Issues:
- Should type checking be required?
- Should unions be embedded in records?

FORTRAN Union Types (EQUIVALENCE)

    INTEGER X
    REAL Y
    EQUIVALENCE (X, Y)

- X and Y cohabit the same storage location.
- X and Y are aliases.
- No type checking is done.

ALGOL 68 Union Types

- The type of the current value can be detected at run time.
- A discriminated union uses a tag, or discriminant.
- The tag (discriminant) identifies the type of the value currently stored.

    union (int, real) ir1, ir2

    union (int, real) ir1;
    int count;
    . . .
    ir1 := 33;
    . . .
    count := ir1;

The first assignment is legal, but the second is not, because the system cannot statically check the type of ir1.

- Conformity clauses solve the problem of type checking for union types.

    union (int, real) ir1;
    int count;
    real sum;
    . . .
    case ir1 in
      (int intval) : count := intval,
      (real realval) : sum := realval
    esac

PASCAL Union Types
- The union is integrated with a record structure.
- Uses a tag, or discriminant.
- Called a variant record.

    type shape = (circle, triangle, rectangle);
         object = record
           case form : shape of
             circle: (diameter : real);
             triangle: (leftside : integer;
                        rightside : integer;
                        angle : real);
             rectangle: (side1 : integer;
                         side2 : integer)
         end;

    var figure : object;

[Figure: storage for the variant record, headed by the discriminant (form)]

Problem: the user program can change the tag without making the corresponding change in the variant.

Eg.

    figure.form := circle;
    figure.side1 := 25;

ADA Union Types
- The tag cannot be changed without making the corresponding change in the variant.
- Checking the tag is required for all references to variants.
- Constrained variant variable: stores only one of the possible variants, which allows static type checking. The tag is treated as a named constant.
- Unconstrained variant variable: the variant can be changed during execution; however, the whole record, including the tag, must be changed.

[Figure: the variants of the shape record. Circle: diameter. Rectangle: side1, side2. Triangle: leftside, rightside, angle.]



    type SHAPE is (CIRCLE, TRIANGLE, RECTANGLE);
    type OBJECT (FORM : SHAPE) is record
       case FORM is
          when CIRCLE =>
             DIAMETER : FLOAT;
          when TRIANGLE =>
             LEFT_SIDE : INTEGER;
             RIGHT_SIDE : INTEGER;
             ANGLE : FLOAT;
          when RECTANGLE =>
             SIDE_1 : INTEGER;
             SIDE_2 : INTEGER;
       end case;
    end record;

    FIGURE_1 : OBJECT;                       -- unconstrained, no initial values
    FIGURE_2 : OBJECT (FORM => TRIANGLE);    -- constrained

Set Types
A set type is one whose variables can store unordered collections of distinct values from some ordinal type, called its base type.

Set types are often used to model mathematical sets.

Sets in Pascal and Modula-2
- Sets are represented as bit strings that fit into a single machine word.
- Set operations:
  - set union
  - set intersection
  - set difference
  - set equality

    type colors = (red, blue, green, yellow, orange, white, black);
         colorset = set of colors;

    var set1, set2 : colorset;

Constant values can be assigned to the set variables set1 and set2, as in

    set1 := [red, blue, yellow, white];
    set2 := [black, blue];

- Set types are usually stored as bit strings in memory.
- Present element: set to 1 (set bit)
- Absent element: set to 0 (clear bit)

Set Operations:

    type chars = 'a'..'g';
         charset = set of chars;

    var set1, set2 : charset;
        set3 : charset;

    begin
      set1 := ['a', 'c', 'f', 'e'];    { 1010110 }
      set2 := ['a', 'b', 'c', 'g'];    { 1110001 }



1. Union: set1 U set2

    set3 := set1 + set2;
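The bit-string representation makes each set operation a single bitwise operation on a machine word. The sketch below models the 'a'..'g' example in Python; the names BASE, to_bits, and show are hypothetical helpers, and placing 'a' in the leftmost bit is an assumption chosen to match the bit patterns shown for set1 and set2.

```python
BASE = "abcdefg"   # the base type, with 'a' as the leftmost bit


def to_bits(elements):
    # Set the bit that corresponds to each member of the set.
    bits = 0
    for ch in elements:
        bits |= 1 << (len(BASE) - 1 - BASE.index(ch))
    return bits


def show(bits):
    # Render the word as a 7-character bit string.
    return format(bits, "0{}b".format(len(BASE)))


set1 = to_bits("acfe")
set2 = to_bits("abcg")

union = set1 | set2            # Pascal: set1 + set2
intersection = set1 & set2     # Pascal: set1 * set2
difference = set1 & ~set2      # Pascal: set1 - set2

print(show(set1), show(set2))  # 1010110 1110001
print(show(union))             # 1110111
print(show(intersection))      # 1010000
print(show(difference))        # 0000110
```

Set equality is then an ordinary comparison of the two words.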



2 fundamental pointer operations:

1. Assignment: sets a pointer variable to the address of some object.
2. Dereferencing: allows a pointer to be followed to the data object to which it points.

2 problems that can be encountered when performing pointer operations:

1. Dangling pointers (dangling references): a pointer still holds the address of a dynamic variable that has been deallocated.
2. Garbage (lost objects): an allocated dynamic variable is no longer accessible to the program.

- In most languages, pointers are used in heap management.

2 types of heap elements:
1. Fixed-size allocation heap
- All heap storage is allocated and deallocated in units of a single size.
- All cells are linked together using the pointers in the cells, forming the free space list.
- Allocation takes the next available cell from that list.
- A dynamic variable can be pointed to by more than one pointer, making it impossible to determine when the variable is no longer useful to the program.
- It is also possible to create a collection of cells that are no longer accessible and should be deallocated.

Ex.

    var p, q, r : ^integer;
        i : integer;
    begin
      new(p); p^ := 4;
      new(q); q^ := 5;
      new(p); p^ := 3;    { the cell that held 4 is now garbage }
      q := p;             { the cell that held 5 is now garbage }
      dispose(p);         { q is now a dangling pointer }
      new(r); r^ := 5;
      q^ := 0;            { writes through the dangling pointer }
      new(p); p^ := 5;
      i := p^ div r^;
    end;

Solutions to the Dangling Pointer Problem

1. Use of Tombstones
- The actual pointer variable points only to tombstones and never to dynamic variables.
- When a dynamic variable is deallocated, the tombstone remains but is set to nil, indicating that the dynamic variable no longer exists.

[Figure: a pointer refers to a tombstone, which in turn refers to the dynamic variable]
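The tombstone scheme can be simulated in a few lines of Python. This is only a model of the idea: the class names Cell and Tombstone and the functions new, dispose, and deref are hypothetical, and Python's None plays the role of nil.

```python
class Cell:
    """A heap-dynamic variable."""

    def __init__(self, value):
        self.value = value


class Tombstone:
    """The extra cell that stands between pointers and the variable."""

    def __init__(self, target):
        self.target = target


def new(value):
    # Program pointers hold a reference to the tombstone, never to the cell.
    return Tombstone(Cell(value))


def dispose(pointer):
    # The tombstone stays behind, but is set to nil.
    pointer.target = None


def deref(pointer):
    # Every access goes through the tombstone, so a dangling use is caught.
    if pointer.target is None:
        raise RuntimeError("dangling pointer: the variable no longer exists")
    return pointer.target
```

If q is a copy of p, both refer to the same tombstone, so dispose(p) also makes deref(q) fail instead of silently reading reused storage. The price is one extra level of indirection on every access.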



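2. Locks-and-Keys
- Pointer values are represented as ordered pairs (key, address), where the key is an integer value.
- Dynamic variables are represented as storage for the variable plus a header cell that stores an integer lock value.
- Each access compares the key of the pointer with the lock of the variable; the access is legal only if they match.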
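A small Python model of locks-and-keys follows. All names here (HeapCell, Pointer, new, copy, dispose, deref) are hypothetical, and using 0 as the illegal lock value is an assumption of this sketch.

```python
import itertools

_next_lock = itertools.count(1)   # source of fresh, nonzero lock values


class HeapCell:
    def __init__(self, value):
        self.lock = next(_next_lock)   # header cell holding the lock
        self.value = value


class Pointer:
    def __init__(self, key, cell):
        self.key = key     # copied from the lock at allocation time
        self.cell = cell   # stands in for the address


def new(value):
    cell = HeapCell(value)
    return Pointer(cell.lock, cell)


def copy(pointer):
    # Pointer assignment copies both the key and the address.
    return Pointer(pointer.key, pointer.cell)


def dispose(pointer):
    # Clearing the lock invalidates every outstanding key at once.
    pointer.cell.lock = 0


def deref(pointer):
    if pointer.key != pointer.cell.lock:
        raise RuntimeError("dangling pointer: key does not match lock")
    return pointer.cell.value
```

Unlike tombstones, no extra cell outlives the variable; the cost is the comparison on every dereference.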

Ways to reclaim garbage

1. Reference Counters (eager approach)
- Reclamation is incremental and is done when inaccessible cells are created.

2. Garbage Collection (lazy approach)
- Reclamation occurs only when the list of available space becomes empty.
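The eager approach can be sketched as follows. The names Cell, attach, detach, and free_list are hypothetical; a real run-time system would perform this bookkeeping on every pointer assignment.

```python
class Cell:
    def __init__(self, value):
        self.value = value
        self.ref_count = 0   # number of pointers currently referring here


free_list = []   # the list of available space


def attach(cell):
    # A pointer is set to refer to the cell.
    cell.ref_count += 1
    return cell


def detach(cell):
    # A pointer stops referring to the cell.
    cell.ref_count -= 1
    if cell.ref_count == 0:
        # The cell just became inaccessible, so reclaim it immediately.
        free_list.append(cell)
```

A lazy garbage collector would skip the counters and the check in detach, and instead scan for unreachable cells only once free_list is empty.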

Abstract Data Types
Data Abstraction

An Abstract Data Type is defined as:

1. A set of data objects, ordinarily using one or more type definitions;
2. A set of abstract operations on those data objects; and
3. Encapsulation of the whole in such a way that the user of the new type cannot manipulate data objects of that type except through the operations defined.

Basic Terminologies:

Information Hiding: clients cannot change the underlying representation of objects directly.

Type Definitions: define the structure of a data object with its possible value bindings.

Example:

    #include <cstdio>

    const int n = 100;    // capacity; the value is assumed for illustration

    class my_stack {
    private:
        int top, element[n];
    public:
        my_stack();
        void pop(int *item);
        void push(int item);
        void s_top();
        int s_empty();
    };

    my_stack::my_stack() { top = 0; }

    void my_stack::pop(int *item) {
        if (top == 0) printf("Stack Empty!\n");
        else {
            top--;                   // top indexes the next free slot
            *item = element[top];
        }
    }

    void my_stack::push(int item) {
        if (top == n) printf("Stack Full!\n");
        else {
            element[top] = item;
            top++;
        }
    }

    int my_stack::s_empty() {
        if (top == 0) return 1;
        else return 0;
    }

    void my_stack::s_top() {
        if (top == 0) printf("Stack Empty\n");
        else printf("%d\n", element[top - 1]);
    }

Notes on the example:

- class my_stack: the name of the abstract data type. Creating an object of type my_stack is similar to declaring an ordinary variable:

    my_stack S1, S2;

- void pop(int *item); void push(int item); void s_top(); int s_empty(); are the methods, or functions, declared inside my_stack. These methods can be invoked only on my_stack objects, and the private data can be reached only through them, which makes the type encapsulated.



SYNTAX AND SEMANTICS

4.1 Introduction

Programming language implementers must be able to determine how the expressions, statements, and program units of a language are formed, and also their intended effect when executed.

Syntax is the form of a language's expressions, statements, and program units.
Semantics is the meaning of those expressions, statements, and program units.

4.2 Syntax

A language is a set of strings of characters from some alphabet.

The strings of a language are called sentences or statements.

The syntax rules of a language specify which strings of characters from the language's alphabet are in the language.

A lexeme is the lowest-level syntactic unit of a language.

A program is a string of lexemes.

A token of a language is a category of its lexemes.

Example: C statement:

    index = 2 * count + 17;

    Lexemes    Tokens
    index      identifier
    =          equal_sign
    2          int_constant
    *          mult_op
    count      identifier
    +          plus_op
    17         int_constant
    ;          semicolon
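Splitting a statement into lexemes and classifying each one as a token is mechanical enough to sketch in Python. The token names come from the table above; TOKEN_SPEC, PATTERN, and tokenize are hypothetical names, and the patterns cover only the lexemes that appear in this one statement.

```python
import re

# Token categories paired with the pattern that recognizes their lexemes.
TOKEN_SPEC = [
    ("identifier", r"[A-Za-z_]\w*"),
    ("int_constant", r"\d+"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
]
PATTERN = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (lexeme, token) pairs for the source string."""
    pairs = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1              # blanks separate lexemes but are not lexemes
            continue
        match = PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError("unexpected character %r" % source[pos])
        pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(lexeme, token)
```

Note that index and count are different lexemes that belong to the same token, identifier.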

4.3 Formal Methods of Describing Syntax

BACKUS-NAUR FORM (BNF)
- Used to specify programming language syntax
- A metalanguage (a language that is used to describe another language)

Four components:
- set of production rules
- set of nonterminal symbols
- set of terminal symbols
- start symbol



    LHS -> RHS

Example:

    <assign> -> <var> = <expression>

- The symbol on the left side of the arrow, which is aptly called the left-hand side (LHS), is the abstraction being defined.
- The text to the right of the arrow is the definition of the LHS. It is called the right-hand side (RHS) and consists of some mixture of tokens, lexemes, and references to other abstractions.
- Altogether, the definition is called a rule, or production.
- The abstractions in a BNF description, or grammar, are often called nonterminal symbols, or simply nonterminals.
- The lexemes and tokens of the rules are called terminal symbols, or simply terminals.
- A grammar is simply a collection of rules.
- Nonterminal symbols can have two or more distinct definitions, representing two or more possible syntactic forms in the language. The alternatives are separated by the symbol |, meaning logical OR.

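Example: PASCAL if

    <if_stmt> -> if <logic_expr> then <stmt>
    <if_stmt> -> if <logic_expr> then <stmt> else <stmt>

or with the rule

    <if_stmt> -> if <logic_expr> then <stmt>
               | if <logic_expr> then <stmt> else <stmt>

Given the following statement, write the corresponding BNF.

1. var a: integer;

    <var_decl> -> var <id> : <type> ;
    <id> -> a | b | c
    <type> -> integer | char | string | real

2. int a;

    <var_decl> -> <type> <id> ;
    <type> -> int | float | char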

Accepted vs. Rejected Sentences
- A string or sentence is accepted if it belongs to the language, that is, if it can be generated from the start symbol using the production rules.
- 2 ways to check if a string is accepted:
  1. DERIVATION
  2. PARSE TREES

Derivation
- The process of generating the valid or accepted sentences of a language by applying a sequence of production rules, beginning with the start symbol.
- 2 types:
  1. Leftmost derivation: always replace the leftmost nonterminal
  2. Rightmost derivation: always replace the rightmost nonterminal



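- A string is accepted if a derivation can end with exactly that string, with only terminal symbols left; otherwise, the string is rejected.

- Example: assignment

    <program> -> begin <stmt_list> end
    <stmt_list> -> <stmt>
                 | <stmt> ; <stmt_list>
    <stmt> -> <var> := <expression>
    <var> -> A | B | C
    <expression> -> <var> + <var>
                  | <var> - <var>
                  | <var>

- A derivation of a program in this language follows:

    <program> => begin <stmt_list> end
              => begin <stmt> ; <stmt_list> end
              => begin <var> := <expression> ; <stmt_list> end
              => begin A := <expression> ; <stmt_list> end
              => begin A := <var> + <var> ; <stmt_list> end
              => begin A := B + <var> ; <stmt_list> end
              => begin A := B + C ; <stmt_list> end
              => begin A := B + C ; <stmt> end
              => begin A := B + C ; <var> := <expression> end
              => begin A := B + C ; B := <expression> end
              => begin A := B + C ; B := <var> end
              => begin A := B + C ; B := C end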

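- Another example: simple assignment statements

    <assign> -> <id> := <expr>
    <id> -> A | B | C
    <expr> -> <id> + <expr>
            | <id> * <expr>
            | ( <expr> )
            | <id>

The statement

    A := B * (A + C)

is generated by the leftmost derivation:

    <assign> => <id> := <expr>
             => A := <expr>
             => A := <id> * <expr>
             => A := B * <expr>
             => A := B * ( <expr> )
             => A := B * ( <id> + <expr> )
             => A := B * ( A + <expr> )
             => A := B * ( A + <id> )
             => A := B * ( A + C )

Therefore, the statement is accepted.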

Will the statement, B := A * (A * B + C) be accepted or rejected?
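Checking acceptance by hand gets tedious, so a small recognizer can help confirm answers. The Python sketch below assumes the simple assignment grammar has the rules assign -> id := expr, id -> A | B | C, and expr -> id + expr | id * expr | ( expr ) | id; the function name accepts and its helpers are hypothetical.

```python
import re


def accepts(text):
    """Return True if text can be derived from the assignment grammar."""
    tokens = re.findall(r":=|\S", text)   # ':=' is one lexeme, others are single characters
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(token):
        nonlocal pos
        if peek() != token:
            raise SyntaxError("expected %r" % token)
        pos += 1

    def parse_id():
        nonlocal pos
        if peek() not in ("A", "B", "C"):
            raise SyntaxError("expected A, B, or C")
        pos += 1

    def parse_expr():
        if peek() == "(":
            # expr -> ( expr )
            expect("(")
            parse_expr()
            expect(")")
            return
        # expr -> id + expr | id * expr | id
        parse_id()
        if peek() in ("+", "*"):
            expect(peek())
            parse_expr()

    try:
        parse_id()
        expect(":=")
        parse_expr()
    except SyntaxError:
        return False
    return pos == len(tokens)   # every lexeme must have been consumed


print(accepts("A := B * (A + C)"))   # True
print(accepts("A := B +"))           # False
```

Each parse function corresponds to one nonterminal, and always expanding the next unread symbol is what makes the underlying derivation a leftmost one.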



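PARSE TREE

- A hierarchical syntactic structure. It pictorially shows how the start symbol of a grammar derives a string in the language.
  - Root node: the start symbol
  - Interior nodes: nonterminal symbols
  - Leaf nodes: tokens or terminal symbols

Note: A grammar that generates a sentence for which there are two or more distinct parse trees is said to be ambiguous. Such a sentence has more than one possible meaning, which may cause misinterpretation.

- Example: assignment

    <program> -> begin <stmt_list> end
    <stmt_list> -> <stmt>
                 | <stmt> ; <stmt_list>
    <stmt> -> <var> := <expression>
    <var> -> A | B | C
    <expression> -> <var> + <var>
                  | <var> - <var>
                  | <var>

Parse Tree Representation: begin A := B + C; B := C end
(each level of indentation is one level of the tree)

    <program>
      begin
      <stmt_list>
        <stmt>
          <var>
            A
          :=
          <expression>
            <var>
              B
            +
            <var>
              C
        ;
        <stmt_list>
          <stmt>
            <var>
              B
            :=
            <expression>
              <var>
                C
      end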

Exercise:

    S -> aAS
       | a
    A -> SbA
       | SS
       | ba

Determine whether each of the following strings is ACCEPTED or REJECTED:

1. aabbaa
2. abba
3. abaabaa

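EXTENDED BNF (EBNF)
- Increases readability and writability

Metasymbols, or notations, used:

    |       Multiple choice (used to represent alternative definitions)
    [ ]     Optional part
    { }*    Repeated zero or more times
    { }+    Repeated one or more times

Example:

BNF:

    <expr> -> <expr> + <term>
            | <expr> - <term>
            | <term>
    <term> -> <term> * <factor>
            | <term> / <factor>
            | <factor>

EBNF:

    <expr> -> <term> {(+ | -) <term>}*
    <term> -> <factor> {(* | /) <factor>}*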
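One practical benefit of the EBNF form is that each repetition { }* maps directly onto a loop in a hand-written parser. The Python sketch below evaluates expressions using the two EBNF rules for expressions and terms; since the notes do not define factor, treating it as an unsigned integer literal is an assumption, and the function names are hypothetical.

```python
import re


def evaluate(text):
    """Parse and evaluate text using the EBNF rules for expr and term."""
    tokens = re.findall(r"\d+|\S", text)
    pos = 0

    def factor():
        # Assumed rule: factor -> unsigned integer literal
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("expected an integer")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        # term -> factor {(* | /) factor}*
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def expr():
        # expr -> term {(+ | -) term}*
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            value = value + right if op == "+" else value - right
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected %r" % tokens[pos])
    return result


print(evaluate("2 + 3 * 4"))    # 14
print(evaluate("10 - 4 - 3"))   # 3
```

The while loops apply operators from left to right, which gives the same left associativity that the left-recursive BNF rules express.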

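Given the following C variable declarations, write the corresponding EBNF.

    int A;
    int A, B;
    char C, D, E;

Given the following forms of the if-then-else statement in Ada, write the corresponding EBNF.

1. if <condition> then <statement>;

2. if <condition> then <statement>;
   else <statement>;

3. if <condition> then <statement>;
   elsif <condition> then <statement>;
   else <statement>;

4. if <condition> then <statement>;
   elsif <condition> then <statement>;
   elsif <condition> then <statement>;
   else <statement>;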
