jhun-ngipol-jr
7/31/2019 Manual on Programming Languages
Programming Languages
UNIVERSITY OF THE CORDILLERAS
PRELIMINARIES
1.1 Reasons for Studying Concepts of Programming Languages
Increased capacity to express ideas.
- It is difficult for people to conceptualize structures that they cannot describe, verbally or in writing.
- Awareness of a wider variety of programming language features can reduce limitations in software development.
- Programmers can increase the range of their software-development thought processes by learning new language constructs.
- It builds an appreciation for valuable language features and encourages programmers to use these features.
Improved background for choosing appropriate languages.
- Many professional programmers have had little formal education in computer science and were trained on the job or through in-house training programs.
- Many other programmers received their formal training in the early days of computer science education, when few languages were widely known.
- The result of this narrow background is that many programmers, when given a choice of languages for a new project, continue to use the language with which they are most familiar, even if it is poorly suited to the new project.
- If these programmers were familiar with the other languages, they would be in a better position to make informed language choices.
Increased ability to learn new languages.
- Computer programming is a young discipline, and design methodologies, software development tools, and programming languages are still in a state of continuous evolution.
- The process of learning a new programming language can be lengthy and difficult, especially for someone who is comfortable with only one or two languages and has never examined programming language concepts in general.
- Once a thorough understanding of the fundamental concepts of languages is acquired, it becomes far easier to see how these concepts are incorporated into the design of the language being learned.
- It is essential that practicing programmers know the vocabulary and fundamental concepts of programming languages so they can read and understand programming language manuals and sales literature for languages and compilers.
Better understanding of the significance of implementation.
- Allows us to visualize how a computer executes various language constructs.
- Helps us understand the relative efficiency of alternative constructs that may be chosen for a program.
- This in turn leads to the ability to use a language more intelligently, as it was designed to be used.
- We can become better programmers by understanding the choices among programming language constructs and the consequences of those choices.
- Certain kinds of program bugs can be found and fixed only by a programmer who knows some related implementation details.
Increased ability to design new languages.
- To a student, the possibility of being required at some future time to design a new programming language may seem remote.
- However, most professional programmers occasionally do design languages of one sort or another.
Overall advancement of computing.
- Finally, there is a global view of computing that can justify the study of programming language concepts.
- Although it is usually possible to determine why a particular programming language became popular, it is not always clear, at least in retrospect, that the most popular languages are the best available.
- In some cases, it might be concluded that a language became widely used, at least in part, because those in positions to choose languages were not sufficiently familiar with programming language concepts.
- In general, if those who choose languages are better informed, better languages will more quickly squeeze out poorer ones.
1.2 Programming Domains
Scientific Applications
- Typically, scientific applications have simple data structures but require large numbers of floating-point arithmetic computations.
- For some scientific applications where efficiency is the primary concern, like those that were common in the 1950s and 1960s, no subsequent language is significantly better than FORTRAN.
Business Applications
- The use of computers for business applications began in the 1950s.
- The first successful high-level language for business was COBOL, which appeared in 1960.
- Business languages are characterized, according to the needs of the application, by elaborate input and output facilities and decimal data types.
- With the advent of microcomputers came new ways for businesses, especially small businesses, to use computers. Two specific tools, spreadsheet systems and database systems, were developed for business and now are widely used.
Artificial Intelligence
- AI is a broad area of computer applications characterized by the absence of exact algorithms and the use of symbolic computation rather than numeric computation.
- Symbolic computation means that symbols, consisting of names rather than numbers, are manipulated.
- The first widely used programming language developed for AI applications was the functional language LISP, which appeared in 1959 (Scheme is a later dialect).
- An alternative approach to these applications appeared in the early 1970s: logic programming using the Prolog language.
Systems Programming Languages
- The operating system and all of the programming support tools of a computer system are collectively known as its systems software.
- Systems software is used almost continuously and therefore must have execution efficiency.
- A language for this domain must have low-level features that allow the software interfaces to external devices to be written.
- In the 1960s and 1970s, some computer manufacturers, such as IBM, Digital, and Burroughs (now UNISYS), developed special machine-oriented high-level languages for systems software on their machines. For IBM mainframe computers, the language was PL/S, a dialect of PL/I; for Digital, it was BLISS, a language at a level just above assembly language; for Burroughs, it was Extended ALGOL.
- The UNIX operating system is written almost entirely in C, which has made it relatively easy to port, or move, to different machines.
Very High-Level Languages (VHLLs)
- The languages in the category called very high-level have evolved slowly over the past 25 years.
- The various scripting languages for UNIX are examples of VHLLs. A scripting language is one that is used by putting a list of commands, called a script, in a file to be executed.
- The first of these languages, named shell, began as a small collection of commands that were interpreted to be calls to system subprograms that performed utility functions, such as file management and simple file filtering.
- Other VHLLs are awk, for report generation, and tcl combined with tk, which provide a method for building X Window applications. Perl is a combination of shell and awk.
Special-Purpose Languages
- A host of special-purpose languages have appeared over the past 40 years.
- They range from RPG, which is used to produce business reports, to APT, which is used for instructing programmable machine tools, to GPSS, which is used for systems simulation.
Where Reg1 and Reg2 are registers. The semantics are:
    Reg1 ← contents(Reg1) + contents(memory_cell)
    Reg1 ← contents(Reg1) + contents(Reg2)
VAX superminicomputers (orthogonal)
    ADDL operand_1, operand_2
Where the semantics is:
    operand_2 ← contents(operand_1) + contents(operand_2)
ALGOL 68: too orthogonal
Ex. Record + array (no restrictions on the types)
    if A == B = x+9 (condition, assignment and arithmetic combined)
Control Statements
- Facilities to transfer control of the program execution from one program part to another.
- Indiscriminate use of goto statements severely reduces program readability.
Remedy when using goto:
1. They must precede their targets, except when used for loops
2. Their targets must never be too distant
3. Their numbers must be limited
- The control statement design of a language can be an important factor in the readability of programs written in that language.
Data Types and Structures
- The presence of adequate facilities for defining data types and data structures in a language is another significant aid to readability.
Ex. Sum_is_too_big := 1 (no Boolean type)
    Sum_is_too_big := true (Boolean type available)
    Record data type vs. collection of similar arrays
Syntax Considerations
- The syntax, or form, of the elements of a language has a significant effect on the readability of programs. The following are three examples of syntactic design choices that affect readability:
1. Identifier forms
- length of the identifier (names)
- case sensitivity
- presence of connectors
2. Special words
- delimiters
- short-circuit evaluation
3. Form and meaning
- the meaning has to agree with, or follow from, the form or syntax.
Writability
- A measure of how easily a language can be used to create programs for a chosen problem domain.
- Most of the language characteristics that affect readability also affect writability.
Simplicity and Orthogonality
- A smaller number of primitive constructs and a consistent set of rules for combining them (that is, orthogonality) is much better than simply having a large number of primitives.
- A programmer can design a solution to a complex problem after learning only a simple set of primitive constructs.
- BUT too much orthogonality can be a detriment to writability. Errors in writing programs can go undetected when nearly any combination of primitives is legal, leading to misuse or disuse.
Support for Abstractions
- Abstraction means hiding the details of implementation.
- The degree of abstraction allowed by a programming language and the naturalness of its expression are very important to its writability.
- Programming languages can support two distinct categories of abstraction:
1. Data Abstraction
Ex. Binary tree
2. Process Abstraction
Ex. subprograms
Expressivity
- There are very powerful operators that allow a great deal of computation to be accomplished with a very small program.
- A language has relatively convenient, rather than cumbersome, ways of specifying computations.
Reliability
A program is said to be reliable if it performs to its specifications under all conditions.
1. Type Checking
- Testing for type errors in a given program, either by the compiler or during program execution.
Ex. Type compatibility between two variables.
- The earlier errors in programs are detected, the less expensive it is to make the required repairs.
- Consider space, time and accuracy
2. Exception Handling
- The ability of a program to intercept run-time errors (as well as other unusual conditions detected by the program), take corrective measures, and then continue execution.
3. Aliasing
- Having two distinct referencing methods, or names, for the same memory cell.
- It is now widely accepted that aliasing is a dangerous feature of a programming language.
4. Readability and Writability
- The easier a program is to write, the more likely it is to be correct.
- Programs that are difficult to read are difficult both to write and to modify.
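The aliasing hazard listed under Reliability can be sketched in a few lines. This is a minimal Python illustration, not taken from the manual; the names `scores`, `backup`, and `add_bonus` are hypothetical.

```python
# Aliasing: two names refer to the same list object (the same "memory cell").
scores = [70, 85, 90]
backup = scores            # no copy is made; backup is an alias of scores

def add_bonus(values, bonus):
    """Add bonus to every element in place and return the same list."""
    for i in range(len(values)):
        values[i] += bonus
    return values

add_bonus(scores, 5)

# The change is visible through the alias, which is easy to overlook.
print(backup)              # [75, 90, 95]

# An explicit copy breaks the alias.
independent = list(scores)
add_bonus(scores, 5)
print(independent)         # [75, 90, 95]
```

The second call changes `scores` and `backup` together, while `independent` keeps the older values because it is a separate object.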
Cost
- Cost of training programmers to use the language
- Cost of writing programs in the language
- Both costs of training and writing programs
- Cost of compiling programs
- Cost of executing programs
- Cost of compilers
- Cost of poor reliability
- Cost of maintaining programs
Optimization is the name given to the collection of methods that compilers may use to decrease the size and/or increase the execution speed of the code they produce.
Criteria for evaluation:
1. Portability
- The ease with which programs can be moved from one implementation to another. (Standardization)
2. Generality
- The applicability to a wide range of applications.
3. Well-definedness
- The completeness and precision of the language's official defining document.
1.4 Influences on Language Design
Computer Architecture
- The basic architecture of computers has a crucial effect on language design. Most of the popular languages of the past 35 years have been designed around the prevalent computer architecture, called the von Neumann architecture. These languages are called imperative languages. In a von Neumann computer, both data and programs are stored in the same memory. The CPU, which executes instructions, is separate from the memory.
- Central features of imperative languages are:
1. Variables, which model the memory cells
2. Assignment statements: the piping operation
3. Iteration constructs: repetition
Programming Methodologies
Data-Oriented
- Simply put, data-oriented methods emphasize data design, concentrating on the use of logical, or abstract, data types to solve problems.
- Object-oriented methodology begins with data abstraction, which encapsulates processing with data objects and hides access to data, and adds inheritance and dynamic type binding. Inheritance is a powerful concept that greatly enhances the possibility of reuse of existing software. Reuse of software components promises to significantly increase software development productivity.
Process-Oriented
- Opposite of data-oriented programming.
- Focuses on concurrency
1.5 Language Design Trade-offs
The task of choosing constructs and features when designing a programming language involves a collection of compromises and trade-offs.
- Conflicting criteria:
1. Reliability vs. Cost of execution
2. Writability vs. Readability
3. Flexibility vs. Safety
1.6 Implementation Methods
Compilation: translates programs from high-level language instructions to machine language, which can be executed directly on the computer.
[Figure: the compilation process. Source program → Lexical analyzer (lexical units) → Syntax analyzer (parse trees) → Intermediate code generator and semantic analyzer (intermediate code) → Optimization (optional) → Code generator (machine language) → Computer, which takes the input data and produces the results. A symbol table is also shown.]
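The first phase of the compilation process, lexical analysis, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the token classes in `TOKEN_SPEC` and the function `tokenize` are hypothetical and describe a tiny expression language, not any real compiler.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (kind, text) pairs; raise ValueError on an unknown character."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":       # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("count = count + 5"))
```

The resulting list of lexical units is what a syntax analyzer would consume to build a parse tree.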
Pure Interpreter: programs are interpreted by another program (called the interpreter) without going through any form of translation.
[Figure: pure interpretation. The source program and the input data are fed to the interpreter, which produces the results.]
Hybrid Implementation Systems: translate high-level language programs to an intermediate language designed to allow easy interpretation.
[Figure: hybrid implementation system. Source program → Lexical analyzer (lexical units) → Syntax analyzer (parse trees) → Intermediate code generator (intermediate code) → Interpreter, which takes the input data and produces the results.]
1.7 Programming Environments
A programming environment is the collection of tools used in the development of software.
1. File system: secondary memory
2. Text editor: with debugger, optimizer
3. Linker: preliminary step in the completion of the result
4. Compiler: large collection of integrated tools
1.8 Programming Paradigms
A Programming Paradigm is a problem-solving approach.
[Figure: classification of programming languages by paradigm. Top level: Programming Language, divided into Process Oriented and Data Oriented. Second-level branches: Imperative, Data Flow, Functional, Constraint, Rule, Object, Database. Third-level labels: List Processing, Array Processing, String Processing, Production System, Logic, Access Oriented, Object Oriented. Entries named in the figure: FORTRAN, ALGOL, COBOL, BASIC, Pascal, C, Modula-2, Ada, Experimental, Applied to PROLOG, spreadsheet packages, query languages, 4GLs, LISP, Logo, Scheme, APL, SNOBOL, PERL, PROLOG, Level 5, VB, C++, Java.]
NAMES, BINDINGS, TYPE CHECKING, AND SCOPES
2.1 Names
Associated with variables, labels, subprograms and formal parameters.
Design Issues:
1. What is the maximum length of a name?
2. Can connector characters be used in names?
3. Are names case sensitive?
4. Are the special words reserved words or keywords?
Special Words
- Used to make programs more readable
- Used to separate the syntactic entities of programs
- Keyword: a word in a programming language that is special only in certain contexts
Ex. PASCAL
    var true: integer;
        flag: boolean;
    begin
      flag := true;
      true := 1;
    end;
- Reserved word: a special word that cannot be used as a name
Ex. C
    int float = 2;
    /* invalid: you cannot use float as a variable name since it is a reserved word
       that signifies a data type */
- Predefined names: names that have a predefined meaning but can be redefined by the user. They must be visible to the compiler when used.
Ex. C
    clrscr();
PASCAL
    writeln(); readln();
2.2 Variables
A variable is an abstraction of a computer memory cell or collection of cells. A variable can be characterized as a sextuple of attributes:
1. Name: as discussed in 2.1; names are often referred to as identifiers.
2. Address: the memory address with which the variable is associated.
3. Value: the contents of the memory cell or cells associated with the variable.
4. Type: determines the range of values the variable can have and the set of operations that are defined for values of the type.
5. Lifetime: the time during which the variable is bound to a specific memory location.
6. Scope: for a variable declared in a procedure, the scope is from its declaration to the end reserved word of the procedure.
2.3 The Concept of Binding
Binding is an association, such as between an attribute and an entity or between an operation and a symbol.
Binding Time is the time at which a binding takes place.
Bindings can take place at language design time, language implementation time, compile time, link time, load time, or run time.
Ex. C
    int count;
    . . .
    count = count + 5;
Some of the bindings and their binding times for the parts of this assignment statement are as follows:
- Set of possible types for count: bound at language design time.
- Type of count: bound at compile time.
- Set of possible values of count: bound at compiler design time.
- Value of count: bound at execution time with this statement.
- Set of possible meanings for the operator symbol +: bound at language definition time.
- Meaning of the operator symbol + in this statement: bound at compile time.
- Internal representation of the literal 5: bound at compiler design time.
Binding of Attributes to Variables
- A binding is static if it occurs before run time and remains unchanged throughout program execution.
- A binding is dynamic if it occurs during run time or can change in the course of program execution.
Type Bindings
- Before a variable can be referenced in a program, it must be bound to a data type.
- The two important aspects of this binding are how the type is specified and when the binding takes place.
- Types can be specified statically through some form of explicit or implicit declaration.
Explicit declaration is a statement in a program that lists variable names and declares them to be of a particular type.
Implicit declaration is a means of associating variables with types through default conventions instead of declaration statements.
Dynamic Type Binding
- The type is not specified by a declaration statement.
- The variable is bound to a type when it is assigned a value in an assignment statement.
- When the assignment statement is executed, the variable being assigned is bound to the type of the value, variable or expression on the right side of the assignment.
- The primary advantage of dynamic binding of variables to types is that it provides a great deal of programming flexibility.
Ex. APL
    LIST ← 10.2 5.1 0.0 (causes LIST to be a one-dimensional array)
    LIST ← 47 (causes LIST to be an integer variable)
- Languages with dynamic type binding are often implemented using interpreters rather than compilers. This is partially because it is difficult to change dynamically the types of variables in machine code.
- There are two disadvantages of dynamic type binding.
1. The error detection capability of the compiler is diminished relative to a compiler for a language with static type bindings, because any two types can appear on opposite sides of the assignment operator.
2. The cost of implementing dynamic attribute binding is considerable, particularly in execution time. Type checking must be done at run time. Furthermore, every variable must have a descriptor associated with it to maintain the current type. The descriptors must also be of varying size, because more space is needed if the variable is a structured type than if it is a primitive type.
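Dynamic type binding and its first disadvantage can be seen in any dynamically typed language. The sketch below uses Python as a stand-in for the LIST example; the names `lst` and `total` are hypothetical.

```python
# The same name is bound to different types by successive assignments.
lst = [10.2, 5.1, 0.0]     # lst is now bound to a list of floats
first_type = type(lst)

lst = 47                   # lst is now bound to an int
second_type = type(lst)

print(first_type.__name__, second_type.__name__)   # list int

def total(values):
    """Sum a sequence; a wrong argument type is detected only when this runs."""
    return sum(values)

try:
    total(lst)             # lst is an int here, not a list
except TypeError as err:
    caught = str(err)
    print("run-time type error:", caught)
```

No error is reported when `lst` is rebound to an integer; the mistake surfaces only when `total` executes, which is the reduced error detection described above.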
Type Inference
- An inferencing mechanism, in which the types of most expressions can be determined without requiring the programmer to specify the types of the variables.
Ex. (ML)
    fun circumf (r) = 3.14159 * r * r;
- a function that takes a real argument and produces a real result.
    fun times10 (x) = 10 * x;
- the argument and functional value are inferred to be of type integer.
    fun square (x) = x * x;
- the type cannot be inferred. Instead, specify it explicitly in one of these ways:
    fun square (x) : int = x * x;
    fun square (x : int) = x * x;
    fun square (x) = (x : int) * x;
    fun square (x) = x * (x : int);
- Type inference is also used in purely functional languages.
Storage Bindings and Lifetime
- Allocation is the process by which the memory cell to which a variable is bound is taken from a pool of available memory.
- Deallocation is the process of placing a memory cell that has been unbound from a variable back into the pool of available memory.
- The lifetime of a program variable is the time during which the variable is bound to a specific memory location. So the lifetime of a variable begins when it is bound to a specific cell and ends when it is unbound from that cell.
- Static Variables
  - Those that are bound to memory cells before program execution begins and remain bound to those same memory cells until program execution terminates.
  - The greatest advantage of static variables is efficiency. All addressing of static variables can be direct.
  - No run-time overhead is incurred for allocation and deallocation.
  - The disadvantage of static binding to storage is reduced flexibility; in particular, in a language that has only variables that are statically bound to storage, recursive subprograms are not supported.
- Stack-Dynamic Variables
  - Those whose storage bindings are created when their declaration statements are elaborated, but whose types are statically bound.
  - Elaboration of such a declaration refers to the storage allocation and binding process indicated by the declaration, which takes place when execution reaches the code to which the declaration is attached.
  - Elaboration occurs during run time.
  - In C, local variables are by default stack-dynamic but can be made static by including the static qualifier in their definitions.
- Explicit Heap-Dynamic Variables
  - Are nameless objects whose storage is allocated and deallocated by explicit run-time instructions specified by the programmer. These variables, which are allocated from and deallocated to the heap, can only be referenced through pointer variables.
  - An example using a C++ code segment:
        int *intnode;
        . . .
        intnode = new int;   // allocates an int object
        . . .
        delete intnode;      // deallocates the object to which intnode points
  - Explicit heap-dynamic variables are often used for dynamic structures, such as linked lists and trees, that need to grow and shrink during execution.
  - The disadvantages of such variables are the difficulty of using them correctly and the cost of references, allocations and deallocations.
- Implicit Heap-Dynamic Variables
  - Are bound to heap storage only when they are assigned values. In fact, all their attributes are bound every time they are assigned.
  - The advantage of such variables is that they have the highest degree of flexibility, allowing highly generic code to be written.
  - The disadvantages are the run-time overhead of maintaining all the dynamic attributes, which could include array subscript types and ranges, among others, and the loss of some error detection by the compiler.
2.4 Type Checking
- Type checking is the activity of ensuring that the operands of an operator are of compatible types.
- A compatible type is one that is either legal for the operator or is allowed under language rules to be implicitly converted by compiler-generated code to a legal type.
- This automatic conversion is called coercion.
- A type error is the application of an operator to an operand of an inappropriate type.
- If all bindings of variables to types are static in a language, then type checking can nearly always be done statically.
- Dynamic type binding requires type checking at run time, which is called dynamic type checking.
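Coercion and dynamic type checking can both be observed in a short sketch. Python is used here only as an example of a language that checks types at run time; the helper `checked_add` is hypothetical.

```python
# Coercion: the int operand is implicitly converted, so the result is a float.
mixed = 1 + 2.5
print(type(mixed).__name__, mixed)      # float 3.5

def checked_add(a, b):
    """Return a + b, or a message if the operand types are incompatible."""
    try:
        return a + b
    except TypeError:
        return f"type error: cannot add {type(a).__name__} and {type(b).__name__}"

print(checked_add(1, 2.5))              # 3.5
print(checked_add("total: ", 5))        # type error: cannot add str and int
```

The second call shows a type error that is detected only when the operator is applied, which is what dynamic type checking means.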
2.5 Strong Typing
- A strongly typed language is one in which each name in a program in the language has a single type associated with it, and that type is known at compile time.
- All types are statically bound.
- A programming language is said to be strongly typed if type errors are always detected.
- The importance of strong typing lies in its ability to detect all misuses of variables that result in type errors.
- A strongly typed language also allows detection, at run time, of uses of incorrect type values in variables that can store values of more than one type.
2.6 Type Compatibility
- Name type compatibility means that two variables have compatible types only if they are either in the same declaration or in declarations that use the same type name.
- Structure type compatibility means that two variables have compatible types if their types have identical structures.
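The difference between the two compatibility rules can be sketched with two record types that share a layout but not a name. This is an illustrative Python model only; `Celsius`, `Fahrenheit`, `name_compatible`, and `structure_compatible` are hypothetical names, and Python itself does not enforce either rule.

```python
from dataclasses import dataclass, fields

@dataclass
class Celsius:
    degrees: float

@dataclass
class Fahrenheit:
    degrees: float

def name_compatible(a, b):
    """Compatible only if both values were declared with the same type name."""
    return type(a) is type(b)

def structure_compatible(a, b):
    """Compatible if both types have the same field names and field types, in order."""
    layout_a = [(fld.name, fld.type) for fld in fields(a)]
    layout_b = [(fld.name, fld.type) for fld in fields(b)]
    return layout_a == layout_b

c = Celsius(21.0)
f = Fahrenheit(69.8)
print(name_compatible(c, f))        # False
print(structure_compatible(c, f))   # True
```

Structure compatibility is more permissive: it would let a Fahrenheit value be used where a Celsius value is expected, which name compatibility forbids.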
2.7 Scope
- The scope of a program variable is the range of statements in which the variable is visible.
- A variable is visible in a statement if it can be referenced in that statement.
- A variable is local in a program unit or block if it is declared there.
- The nonlocal variables of a program unit or block are those that are visible within the program unit or block but are not declared there.
Static Scope
- The scope of a variable can be statically determined, that is, prior to execution.
- See example on page 172.
Blocks
- Section of code
Dynamic Scope
- Is based on the calling sequence of subprograms, not on their spatial relationship to each other.
- The scope can be determined only at run time.
- See example on page 177.
Evaluation of Dynamic Scoping
- The correct attributes of non-local variables visible to a program statement cannot be determined statically.
- Several kinds of programming problems follow directly from dynamic scoping.
- Dynamic scoping results in less reliable programs than static scoping.
- Inability to statically type check references to non-locals.
- Dynamic scoping also makes programs much more difficult to read, because the calling sequence of subprograms must be known to determine the meaning of references to non-local variables.
- On the other hand, dynamic scoping can be used to advantage in programming: subprograms inherit the context of their callers.
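The contrast between static and dynamic scoping can be sketched as follows. Python is statically scoped, so the dynamic half is a simulation built on an explicit stack of environments; `env_stack`, `lookup`, and the function names are hypothetical.

```python
# Static (lexical) scoping: Python resolves x where show_static is defined.
x = "global x"

def show_static():
    return x

def caller_static():
    x = "caller's x"       # a new local; invisible to show_static
    return show_static()

print(caller_static())     # global x

# Dynamic scoping, simulated with an explicit stack of environments.
env_stack = [{"x": "global x"}]

def lookup(name):
    """Search the most recent caller's environment first, then older ones."""
    for env in reversed(env_stack):
        if name in env:
            return env[name]
    raise NameError(name)

def show_dynamic():
    return lookup("x")

def caller_dynamic():
    env_stack.append({"x": "caller's x"})   # enter the caller's scope
    try:
        return show_dynamic()
    finally:
        env_stack.pop()                     # leave the caller's scope

print(caller_dynamic())    # caller's x
print(show_dynamic())      # global x
```

Under the simulated dynamic rule, the meaning of `x` inside `show_dynamic` depends on who called it, which is why the calling sequence must be known to read such programs.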
2.8 Scope and Lifetime
- Relation: The scope of a variable is from its declaration to the end reserved word of the procedure. The lifetime of that variable is the period of time beginning when the procedure is entered and ending when execution of the procedure reaches the end.
2.9 Referencing Environments
- The referencing environment of a statement is the collection of all names that are visible in the statement.
- The referencing environment of a statement in a static-scoped language is the variables declared in its local scope plus the collection of all variables of its ancestor scopes that are visible. (See pages 180-181 for examples.)
2.10 Named Constants
- A named constant is a variable that is bound to a value only at the time it is bound to storage; its value cannot be changed by assignment or by an input statement.
- Named constants are useful as aids to readability and program reliability.
- Readability can be improved, for example, by using the name pi instead of the constant 3.14159.
2.11 Variable Initialization
- It is convenient for variables to have values before the code of the program or subprogram in which they are declared begins executing.
- The binding of a variable to a value at the time it is bound to storage is called initialization.
- If the variable is statically bound to storage, binding and initialization occur before run time.
- If the storage binding is dynamic, initialization is also dynamic.
DATA TYPES
3.1 Introduction
Computer programs produce results by manipulating data. An important factor in determining the ease with which they can perform this task is how well the data types match the real-world problem space. It is crucial, therefore, that a language support the proper variety of data types and structures.
3.2 Primitive Data Types
The data types that are not defined in terms of other types are called primitive data types. Nearly all programming languages provide a set of primitive data types.
The primitive data types of a language are used, along with one or more type constructors, to provide the structured types.
a. Numeric Types
Integer
- The most common primitive data type
- Represented in a computer by a string of bits, with one of the bits representing the sign.
- Implementations:
[Figure: integer representations. A sign bit followed by the binary integer, optionally preceded by a type descriptor.]
Floating Point
- Models real numbers, but the representations are only approximations for most real numbers (Ex. π, e)
- Has value ranges that are defined in terms of precision and range.
- Problem: Loss of accuracy through arithmetic operations
- Implementations:
Single precision: sign bit (1 bit), exponent (8 bits), fraction (23 bits)
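The single-precision layout and the loss of accuracy can be checked directly. The sketch below assumes the IEEE 754 single-precision format, which matches the 8-bit exponent and 23-bit fraction given here; the function `float32_fields` is a hypothetical helper.

```python
import struct

def float32_fields(value):
    """Split an IEEE 754 single-precision value into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                   # 1 sign bit
    exponent = (bits >> 23) & 0xFF      # 8-bit biased exponent
    fraction = bits & 0x7FFFFF          # 23-bit fraction
    return sign, exponent, fraction

print(float32_fields(1.0))        # (0, 127, 0)
print(float32_fields(-0.15625))   # (1, 124, 2097152)

# Loss of accuracy: 0.1 and 0.2 have no exact binary representation.
print(0.1 + 0.2 == 0.3)           # False
```

The exponent is stored with a bias of 127, so 1.0 (which is 1.0 × 2^0) has a stored exponent of 127.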
Double precision: sign bit (1 bit), exponent (11 bits), fraction (52 bits)
Decimal
- Stores a fixed number of decimal digits, with the decimal point at a fixed position in the value.
- Uses the binary coded decimal (BCD) representation
- Ex. PL/I:
    DECLARE X FIXED DECIMAL(10,3)
  COBOL:
    X PICTURE 999V99
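The benefit of a decimal type over binary floating point can be sketched with Python's standard `decimal` module. This is an assumption-laden stand-in: `decimal.Decimal` is not a BCD implementation and is not fixed-point by default, but it shows the exact decimal arithmetic that decimal types are meant to provide.

```python
from decimal import Decimal

# Binary floating point cannot represent most decimal fractions exactly.
float_total = 0.10 + 0.20
print(float_total == 0.30)                 # False

# A decimal type stores the decimal digits themselves, so the sum is exact.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))    # True

# Fixed scale, similar in spirit to FIXED DECIMAL(10,3): keep three fraction digits.
price = Decimal("1234.5").quantize(Decimal("0.001"))
print(price)                               # 1234.500
```

This exactness is why business languages, where amounts of money must add up to the cent, provide decimal data types.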
Boolean Types
- Has only 2 elements in its range (true or false)
- Often used for switches or flags
- Could be represented by a single bit, but the smallest addressable unit (typically a byte) is normally used
Character Types
- Stored as numeric codings
- Commonly uses the ASCII representation
- Java uses the Unicode representation
3.3 Structured Data Types
Character String Types
- One in which the objects consist of sequences of characters
- Design issues:
  Should strings be a primitive type or simply a special kind of character array?
  Should strings have static or dynamic length?
- String Length Options:
Static length string
Ex:
    A: String[20] (Pascal)
    Character (len=15) Name1, Name2 (Fortran)
Implementation:
- Requires a compile-time descriptor with a field for the length
[Figure: compile-time descriptor for static strings, with fields: Static string, Length, Address]
Limited dynamic length
o Allows strings to have varying length up to a declared and fixed maximum set by the variable definition.
o Ex: char A[20];
o Implementation: requires a run-time descriptor to store both the fixed maximum length and the current length
Dynamic length string
o Strings have varying length with no maximum
o Provides maximum flexibility
o Ex: Snobol4
    Newline = trim(input)
o Implementation: requires a simpler run-time descriptor; only the current length needs to be stored.
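The run-time descriptor of a limited dynamic length string can be modelled as a small class. This is a hypothetical Python sketch, not how any particular language implements strings; `LimitedString` and its members are illustrative names.

```python
class LimitedString:
    """A string with a fixed maximum length and a varying current length."""

    def __init__(self, max_length, text=""):
        self.max_length = max_length        # fixed when the variable is defined
        self._chars = []                    # storage for the characters
        self.assign(text)

    @property
    def current_length(self):
        """The part of the descriptor that changes at run time."""
        return len(self._chars)

    def assign(self, text):
        """Replace the contents; reject text longer than the declared maximum."""
        if len(text) > self.max_length:
            raise ValueError(f"length {len(text)} exceeds maximum {self.max_length}")
        self._chars = list(text)

    def __str__(self):
        return "".join(self._chars)

name = LimitedString(20, "Baguio")
print(name.current_length, name.max_length)   # 6 20
name.assign("Cordillera")
print(str(name), name.current_length)         # Cordillera 10
```

A fully dynamic string would drop the `max_length` field and the length check, which is why its descriptor is simpler.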
User-Defined Ordinal Types
- An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers.

Enumeration Types
An enumeration type is one in which all of the possible values, which are symbolic constants, are enumerated in the definition.
Ex. (Ada)
    type DAYS is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
Design Issues:
- Is a literal constant allowed to appear in more than one type definition?
- If so, how is the type of an occurrence of the literal in the program checked?
[Figure: run-time descriptor for limited dynamic strings, with fields: limited dynamic string, maximum length, current length, address]
Designs:
Pascal:
- A literal constant is not allowed to be used in more than one enumeration type definition.
- Enumeration type variables can be used as
    - array subscripts
    - for loop variables
    - case selector expressions
- Enumeration values can be compared using relational operators.
Example:
    type colortype = (red, blue, green, yellow);
    var color : colortype;
    . . .
    color := blue;
    if (color > red) . . .
Ada:
- Literals are allowed to appear in more than one declaration in the same referencing environment. These are called overloaded literals.
Example:
    type LETTERS is (A, B, C, D, E, F, G, H, I, J, K, L, M,
                     N, O, P, Q, R, S, T, U, V, W, X, Y, Z);
    type VOWELS is (A, E, I, O, U);
    for LETTER in A..U loop                        (ambiguous)
    for LETTER in VOWELS'(A)..VOWELS'(U) loop      (qualified, unambiguous)
Evaluation:
Common operations for enumeration types are predecessor, successor, position in the list of values, and value for a given position number. In Pascal, these operations are provided by built-in functions. For example, pred(blue) is red. In Ada, they are attributes. For example, LETTERS'PRED(B) is A.
Enumeration types provide greater readability in a very direct way: named values are easily recognized, whereas coded values are not. They also provide type checking.
Subrange Types
A subrange type is a contiguous subsequence of an ordinal type. For example, 12..14 is a subrange of the integer type.
Evaluation:
- Subrange types enhance readability by making it clear to readers that variables of subrange types can store only certain ranges of values.
- Reliability is increased with subrange types, because assigning a value to a subrange variable that is outside the specified range is detected as an error by the run-time system.
- Implementation of User-Defined Ordinal Types
Enumeration types are usually implemented by associating a non-negative integer value with each symbolic constant in the type. Typically, the first enumeration value is represented as 0, the second as 1, and so forth. As long as the association is constant, the integers can be used in place of the enumeration constants. Of course, the operations allowed on enumeration types are dramatically different from those allowed on integers, except for the relational operators, which are identical.
In ANSI C and C++, enumeration types are often treated exactly like integers.
Subrange types are implemented in exactly the same way as their parent types, except that range checks must be included in every assignment. This increases code size and execution time but is usually considered well worth the cost. Also, a good optimizing compiler can optimize some of the checking away.
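The integer encoding and the assignment-time range check described above can be sketched in Python. This is only an illustration under stated assumptions: the names Color, pred, succ, and Subrange are hypothetical and not taken from any of the languages discussed.

```python
from enum import IntEnum


class Color(IntEnum):
    # Mirrors the usual implementation: first constant is 0, second is 1, ...
    RED = 0
    BLUE = 1
    GREEN = 2
    YELLOW = 3


def pred(value):
    """Predecessor of an enumeration value (like Pascal's pred)."""
    if value == min(type(value)):
        raise ValueError("the first value has no predecessor")
    return type(value)(value - 1)


def succ(value):
    """Successor of an enumeration value (like Pascal's succ)."""
    if value == max(type(value)):
        raise ValueError("the last value has no successor")
    return type(value)(value + 1)


class Subrange:
    """A variable restricted to low..high; every assignment is range checked."""

    def __init__(self, low, high, value):
        self.low, self.high = low, high
        self.assign(value)

    def assign(self, value):
        # This is the run-time check a compiler inserts for subrange variables.
        if not (self.low <= value <= self.high):
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        self.value = value
```

With these definitions, pred(Color.BLUE) yields Color.RED, and assigning 15 to Subrange(12, 14, 12) raises an error at run time.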
Array Types
- An array is a homogeneous aggregate of data elements in which an individual element is identified by its position in the aggregate, relative to the first element.
- Array elements are referenced by means of a two-level syntactic mechanism: the aggregate name, followed by subscripts (indexes).
    Syntax: array_name[index] → element
- Static binding: the binding of the subscript type to an array variable is usually static.
- Dynamic binding: the subscript value ranges are sometimes dynamically bound.
Four Categories of Arrays
1. Static Array: the subscript ranges are statically bound and storage allocation is static (done before run time). The advantage of static arrays is efficiency: no dynamic allocation or deallocation is required.
2. Fixed Stack-Dynamic Array: the subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution. The advantage of fixed stack-dynamic arrays over static arrays is space efficiency. A large array in one procedure can use the same space as a large array in a different procedure, as long as both procedures are never active at the same time.
Eg.
    A : array[1..10] of integer;    (Pascal)
    int A[10];                      (C/C++)
3. Stack-Dynamic Array: the subscript ranges are dynamically bound and storage allocation is dynamic (done during run time). Once the subscript ranges are bound and the storage is allocated, they remain fixed during the lifetime of the variable. Its major advantage over static and fixed stack-dynamic arrays is flexibility.
Eg. Ada
    Get (LIST_LEN);
    declare
      LIST : array (1..LIST_LEN) of INTEGER;
    begin
      . . .
    end;
4. Heap-Dynamic Array: the binding of subscript ranges and storage allocation is dynamic, and can change any number of times during the array's lifetime. Arrays can grow and shrink during execution as the need for space changes.
Eg. Visual Basic
    Dim StudArr() As String
    ReDim StudArr(10) As String
    ReDim Preserve StudArr(15) As String

The number of subscripts allowed in arrays varies.
Eg.
    FORTRAN I: limited to 3 dimensions
    FORTRAN IV: up to 7 dimensions
    Contemporary languages: no limitation
Array Initialization
Fortran 77:
    INTEGER LIST (3)
    DATA LIST /0, 5, 5/
ANSI C/C++:
    int list[] = {4, 5, 7, 83};
    char name[] = "Freddie";
    char *names[] = {"Jo", "Bob", "Jake", "Darcie"};
Ada:
    LIST : array (1..5) of INTEGER := (1, 3, 5, 7, 9);
    BUNCH : array (1..5) of INTEGER := (1 => 3, 3 => 4, others => 0);

Slices
- A slice of an array is some substructure of that array.
See example on Page 213
Implementation of array types
- Requires more compile-time effort than simple built-in data types.
- The code to allow accessing of array elements must be generated at compile time.
- At run time, this code must be executed to produce element addresses.
- Two ways to map multidimensional arrays onto one-dimensional memory:
    1. Row Major Order
    2. Column Major Order
- The compile-time descriptor for single-dimensional arrays holds the information required to construct the access function.
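The access function for row major and column major order can be sketched as follows. The function names and the (indices, lower_bounds, sizes) parameter layout are assumptions made for illustration; a real compiler would fold most of this arithmetic into constants taken from the descriptor.

```python
def row_major_offset(indices, lower_bounds, sizes):
    """Element offset when the last subscript varies fastest."""
    offset = 0
    for index, low, size in zip(indices, lower_bounds, sizes):
        offset = offset * size + (index - low)
    return offset


def column_major_offset(indices, lower_bounds, sizes):
    """Element offset when the first subscript varies fastest."""
    offset = 0
    for index, low, size in zip(reversed(indices),
                                reversed(lower_bounds),
                                reversed(sizes)):
        offset = offset * size + (index - low)
    return offset


def element_address(base, element_size, offset):
    """Address = base address + element size * number of elements skipped."""
    return base + element_size * offset
```

For a 3 x 4 array with lower bounds of 1, element (2, 3) has offset 6 in row major order and offset 7 in column major order.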
Record Types
A record is a possibly heterogeneous aggregate of data elements in which the individual elements are identified by names.

Records vs. Arrays
    Records                                 Arrays
    - heterogeneous                         - homogeneous
    - fields are named with identifiers     - elements are referenced by index
    - allowed to include unions
[Figure: compile-time descriptor for arrays, with fields: array, element type, index type, number of dimensions, index]
Pascal:
    type empRec = record
           fn, mi, ln : string[30];
           dept : string[2];
         end;
    var emp : empRec;
    begin
      writeln(emp.dept);
    end.

C:
    #include <stdio.h>

    typedef struct {
        char fn[30], mi[30], ln[30];
        char dept[2];
    } empRec;

    int main(void)
    {
        empRec emp = {0};          /* zero-fill so dept is a valid empty string */
        printf("%s", emp.dept);
        return 0;
    }
To reference an element:
- Dot notation (Pascal and C/C++)
- % notation (Fortran)

Fully qualified reference: all intermediate record names, from the largest enclosing record to the specified field, are named in the reference.
    employee.name := 'Bob';
    employee.age := 42;
    employee.sex := 'M';
    employee.salary := 23750.00;

Elliptical reference: record names can be omitted.
    with employee do
    begin
      name := 'Bob';
      age := 42;
      sex := 'M';
      salary := 23750.0;
    end; {end of with}
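A field name in a reference such as emp.dept is ultimately translated into a fixed offset from the start of the record. As a rough illustration, Python's standard ctypes module can lay out a C-compatible structure and report each field's offset. EmpRec mirrors the empRec example, and field_offsets is a hypothetical helper introduced here.

```python
import ctypes


class EmpRec(ctypes.Structure):
    # Same fields as the empRec example: three 30-character names, a 2-character dept.
    _fields_ = [
        ("fn", ctypes.c_char * 30),
        ("mi", ctypes.c_char * 30),
        ("ln", ctypes.c_char * 30),
        ("dept", ctypes.c_char * 2),
    ]


def field_offsets(record_type):
    """Map each field name to its byte offset from the start of the record."""
    return {name: getattr(record_type, name).offset
            for name, _ in record_type._fields_}
```

Here field_offsets(EmpRec) gives offsets 0, 30, 60, and 90: the fields sit next to each other, and a reference to a field is simply the record's address plus that field's offset.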
Implementation of Records:
- Fields of a record are stored in adjacent memory locations (Field 1 through Field n).
- Field accesses are all handled using the offsets of the fields from the beginning of the record.
[Figure: compile-time descriptor for a record, listing the name, type, and offset of each field, followed by the record's address]

Union Types
A union is a type that may store values of different types at different times during program execution.
Design Issues:
- Should type checking be required?
- Should unions be embedded in records?

FORTRAN Union Types (EQUIVALENCE)
    INTEGER X
    REAL Y
    EQUIVALENCE (X, Y)
- X and Y cohabit the same storage location.
- X and Y are aliases.
- No type checking is done.

ALGOL 68 Union Types
- The type of the current value can be detected at run time.
- A discriminated union uses a tag, or discriminant.
- The tag (discriminant) identifies the type of the value currently stored.
    union (int, real) ir1, ir2

    union (int, real) ir1;
    int count;
    . . .
    ir1 := 33;
    . . .
    count := ir1;
  The first assignment is legal, but the second is not, because the system cannot statically check the type of ir1.
- Conformity clauses solve the problem of type checking for union types:
    union (int, real) ir1;
    int count;
    real sum;
    . . .
    case ir1 in
      (int intval) : count := intval,
      (real realval) : sum := realval
    esac
PASCAL Union Types
- The union is integrated with a record structure.
- Uses a tag, or discriminant.
- Called a variant record.
    type shape = (circle, triangle, rectangle);
         object = record
           case form : shape of
             circle:    (diameter : real);
             triangle:  (leftside : integer;
                         rightside : integer;
                         angle : real);
             rectangle: (side1 : integer;
                         side2 : integer)
         end;
    var figure : object;
[Figure: storage for the variant record: the discriminant (form), followed by one of the variants (circle: diameter; rectangle: side1, side2; triangle: leftside, rightside, angle)]
Problem: the user program can change the tag without making the corresponding change in the variant.
Eg.
    figure.form := circle;
    figure.side1 := 25;

ADA Union Types
- The tag cannot be changed without making the corresponding change in the variant.
- Checking the tag is required for all references to variants.
- Constrained variant variable: stores only one of the possible types of values in the variant, thus allowing static type checking. The tag is treated as a named constant.
- Unconstrained variant variable: the type of the variant can be changed during execution; however, the whole record must be assigned, including the tag.
    type SHAPE is (CIRCLE, TRIANGLE, RECTANGLE);
    type OBJECT (FORM : SHAPE) is record
      case FORM is
        when CIRCLE    => DIAMETER : FLOAT;
        when TRIANGLE  => LEFT_SIDE : INTEGER;
                          RIGHT_SIDE : INTEGER;
                          ANGLE : FLOAT;
        when RECTANGLE => SIDE_1 : INTEGER;
                          SIDE_2 : INTEGER;
      end case;
    end record;

    FIGURE_1 : OBJECT;                       -- unconstrained, no initial value
    FIGURE_2 : OBJECT (FORM => TRIANGLE);    -- constrained
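The two Ada rules (the tag is checked on every variant reference, and the tag changes only together with the whole variant) can be imitated in a short Python sketch. The class Figure, its replace and get methods, and the VARIANT_FIELDS table are hypothetical names chosen for this illustration; Ada enforces the same rules in the language itself.

```python
VARIANT_FIELDS = {
    "circle": {"diameter"},
    "triangle": {"left_side", "right_side", "angle"},
    "rectangle": {"side_1", "side_2"},
}


class Figure:
    """A discriminated union whose tag cannot drift away from its variant."""

    def __init__(self, form, **fields):
        self.replace(form, **fields)

    def replace(self, form, **fields):
        # Whole-record assignment: tag and variant fields change together.
        if set(fields) != VARIANT_FIELDS[form]:
            raise TypeError(f"wrong fields for a {form}")
        self._form = form
        self._fields = dict(fields)

    @property
    def form(self):
        # Read-only: there is no setter, so the tag alone cannot be changed.
        return self._form

    def get(self, name):
        # Every reference to a variant field checks the current tag first.
        if name not in VARIANT_FIELDS[self._form]:
            raise TypeError(f"{name} is not a field of a {self._form}")
        return self._fields[name]
```

Unlike the Pascal variant record, an attempt to read side_1 from a circle fails here instead of silently reinterpreting the storage.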
Set Types
A set type is one whose variables can store unordered collections of distinct values from some ordinal type, called its base type.
Set types are often used to model mathematical sets.

Sets in Pascal and Modula-2
- Represent sets as bit strings that fit into a single machine word.
- Set operations:
    Set union
    Set intersection
    Set difference
    Set equality
    type colors = (red, blue, green, yellow, orange, white, black);
         colorset = set of colors;
    var set1, set2 : colorset;
Constant values can be assigned to the set variables set1 and set2, as in
    set1 := [red, blue, yellow, white];
    set2 := [black, blue];
- Set types are usually stored as bit strings in memory.
- Present element: bit set to 1 (set bit)
- Absent element: bit set to 0 (clear bit)

Set Operations:
    type chars = 'a'..'g';
         charset = set of chars;
    var set1, set2 : charset;
        set3 : charset;
    begin
      set1 := ['a', 'c', 'f', 'e'];   { 1010110 }
      set2 := ['a', 'b', 'c', 'g'];   { 1110001 }
1. Union: set1 U set2
      set3 := set1 + set2;
      set3
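Because each set is a bit string, the set operations reduce to bitwise operations on a machine word. The sketch below reuses the base type 'a'..'g' and the two example sets; the names BASE, to_bits, and show are assumptions made for illustration.

```python
BASE = "abcdefg"  # base type 'a'..'g'; 'a' is the leftmost bit


def to_bits(elements):
    """Encode a collection of base-type values as a bit string held in an int."""
    bits = 0
    for ch in elements:
        bits |= 1 << (len(BASE) - 1 - BASE.index(ch))
    return bits


def show(bits):
    """Render the bit string with one digit per base-type value."""
    return format(bits, "0{}b".format(len(BASE)))


set1 = to_bits("acfe")             # 1010110
set2 = to_bits("abcg")             # 1110001

union = set1 | set2                # bitwise OR
intersection = set1 & set2         # bitwise AND
difference = set1 & ~set2          # AND with the complement
equal = set1 == set2               # plain word comparison
```

For these two sets, show(union) is 1110111, show(intersection) is 1010000, and show(difference) is 0000110.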
2 fundamental pointer operations:
1. Assignment: sets a pointer variable to the address of some object.
2. Dereferencing: allows a pointer to be followed to the data object to which it points.

2 problems that can be encountered when performing pointer operations:
1. Dangling pointers (dangling references)
2. Garbage (lost objects)
- In most languages, pointers are used in heap management.

2 types of heap elements:
1. Fixed-size allocation heap
- All heap storage is allocated and deallocated in units of a single size.
- All cells are linked together using the pointers in the cells, forming the free space list.
- Allocation takes the next available cell from the free space list.
- A dynamic variable can be pointed to by more than one pointer, making it impossible to determine when the variable is no longer useful to the program.
- It is also possible to create a collection of cells that are no longer accessible and should be deallocated.
Ex.
    var p, q, r : ^integer; i : integer;
    begin
      new(p); p^ := 4;
      new(q); q^ := 5;
      new(p); p^ := 3;    { the cell holding 4 is now garbage }
      q := p;             { the cell holding 5 is now garbage }
      dispose(p);         { q is now a dangling pointer }
      new(r); r^ := 5;
      q^ := 0;            { writes through the dangling pointer }
      new(p); p^ := 5;
      i := p^ div r^;
    end;
Solutions to the Dangling Pointer Problem
1. Use of Tombstones
- The actual pointer variable points only to tombstones and never to dynamic variables.
- When a dynamic variable is deallocated, the tombstone remains but is set to nil, indicating that the dynamic variable no longer exists.
[Figure labels: tombstone, dynamic variable]
2. Locks-and-Keys
- Pointer values are represented as ordered pairs (key, address), where the key is an integer value.
- Dynamic variables are represented as storage for the variable plus a header cell that stores an integer lock value.
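The first of the two schemes above, tombstones, can be sketched in a few lines. Everything here (Tombstone, new, dispose, load, store, DanglingPointerError) is a hypothetical name for illustration, and Python's None stands in for nil.

```python
class DanglingPointerError(Exception):
    """Raised when a pointer is followed after its variable was deallocated."""


class Tombstone:
    """The extra cell that pointer variables point to."""

    def __init__(self, cell):
        self.cell = cell  # reference to the dynamic variable, or None (nil)


def new(value):
    # A one-element list plays the role of the heap-allocated variable.
    return Tombstone([value])


def dispose(pointer):
    # The tombstone stays behind, marked nil; every copy of the pointer sees it.
    pointer.cell = None


def load(pointer):
    if pointer.cell is None:
        raise DanglingPointerError("dynamic variable no longer exists")
    return pointer.cell[0]


def store(pointer, value):
    if pointer.cell is None:
        raise DanglingPointerError("dynamic variable no longer exists")
    pointer.cell[0] = value
```

After p = new(3), q = p, and dispose(p), a later load(q) is reported as an error instead of reading reused storage. The price is one extra level of indirection on every access, plus tombstones that are never reclaimed.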
Ways to reclaim garbage
1. Reference Counters (eager approach)
- Reclamation is incremental and is done when inaccessible cells are created.
2. Garbage Collection (lazy approach)
- Reclamation occurs only when the list of available space becomes empty.
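The eager approach can be sketched as a counter kept in every cell and adjusted on each pointer assignment. Cell, Heap, and the dictionary of named pointer variables are assumptions made for this illustration only.

```python
class Cell:
    """A heap cell carrying a count of the pointers that reference it."""

    def __init__(self, value):
        self.value = value
        self.count = 0


class Heap:
    def __init__(self):
        self.reclaimed = []  # cells returned to the list of available space

    def assign(self, pointers, name, cell):
        """Make pointer variable `name` point to `cell` (None means nil)."""
        old = pointers.get(name)
        if cell is not None:
            cell.count += 1          # increment first, so p := p is safe
        if old is not None:
            old.count -= 1
            if old.count == 0:       # cell just became inaccessible
                self.reclaimed.append(old)
        pointers[name] = cell
```

A cell is reclaimed at the moment its counter reaches zero, which is what makes the approach incremental. A lazy garbage collector would instead do nothing on assignment and search for unreachable cells only when the available space runs out.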
Abstract Data Types
Data Abstraction
An Abstract Data Type is defined as:
1. A set of data objects, ordinarily using one or more type definitions,
2. A set of abstract operations on those data objects, and
3. Encapsulation of the whole in such a way that the user of the new type cannot manipulate data objects of the type except by use of the operations defined.

Basic Terminologies:
Information Hiding: clients cannot change the underlying representation of objects directly.
Type Definition: defines the structure of a data object with its possible value bindings.
Example:
    #include <stdio.h>

    const int n = 100;    // stack capacity (assumed; n is not defined in the original)

    class my_stack {
    private:
        int top, element[n];
    public:
        my_stack();
        void pop(int *item);
        void push(int item);
        void s_top();
        int s_empty();
    };

    my_stack::my_stack() { top = 0; }

    void my_stack::pop(int *item) {
        if (top == 0) printf("Stack Empty!");
        else {
            top--;                  // top indexes the next free slot
            *item = element[top];
        }
    }

    void my_stack::push(int item) {
        if (top == n) printf("Stack Full!");
        else {
            element[top] = item;
            top++;
        }
    }

    int my_stack::s_empty() {
        if (top == 0) return 1;
        else return 0;
    }

    void my_stack::s_top() {
        if (top == 0) printf("Stack Empty");
        else printf("%d\n", element[top - 1]);
    }
class my_stack
- The name of the abstract data type. Creating an object of type my_stack is similar to declaring an ordinary variable:
    my_stack S1, S2;

void pop(int *item); void push(int item); void s_top(); int s_empty();
- These are the methods (member functions) declared inside my_stack. They can be invoked only through my_stack objects, which is what makes the type encapsulated.
SYNTAX AND SEMANTICS

4.1 Introduction
Programming language implementers must be able to determine how the expressions, statements, and program units of a language are formed, and also their intended effect when executed.
The syntax of a language is the form of its expressions, statements, and program units.
The semantics of a language is the meaning of those expressions, statements, and program units.
4.2 Syntax
A language is a set of strings of characters from some alphabet.
The strings of a language are called sentences or statements.
The syntax rules of a language specify which strings of characters from the language's alphabet are in the language.
A lexeme is the lowest-level syntactic unit of a language.
A program is a string of lexemes.
A token of a language is a category of its lexemes.

Example: C statement:
    index = 2 * count + 17;

    Lexemes     Tokens
    index       identifier
    =           equal_sign
    2           int_constant
    *           mult_op
    count       identifier
    +           plus_op
    17          int_constant
    ;           semicolon
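The split into lexemes and tokens can be reproduced with a small scanner. The sketch below uses Python's standard re module; the token names come from the table above, while TOKEN_PATTERNS, MASTER, and tokenize are names introduced here, and the patterns cover only the lexemes that appear in the example.

```python
import re

# One pattern per token category, tried in this order.
TOKEN_PATTERNS = [
    ("identifier", r"[A-Za-z_]\w*"),
    ("int_constant", r"\d+"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_PATTERNS))


def tokenize(text):
    """Return a list of (lexeme, token) pairs for the given statement."""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():      # blanks separate lexemes but are not lexemes
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs
```

Calling tokenize("index = 2 * count + 17;") yields the eight pairs of the table, in order.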
4.3 Formal Methods of Describing Syntax

BACKUS-NAUR FORM (BNF)
- Used to specify programming language syntax.
- A metalanguage (a language that is used to describe another language).
Four components:
- set of production rules, or grammar
- set of nonterminal symbols
- set of terminal symbols
- start symbol
    LHS → RHS
Example:
    <assign> → <var> = <expression>
- The symbol on the left side of the arrow, which is aptly called the left-hand side (LHS), is the abstraction being defined.
- The text to the right of the arrow is the definition of the LHS. It is called the right-hand side (RHS) and consists of some mixture of tokens, lexemes, and references to other abstractions.
- Altogether, the definition is called a rule, or production.
- The abstractions in a BNF description, or grammar, are often called nonterminal symbols, or simply nonterminals.
- The lexemes and tokens of the rules are called terminal symbols, or simply terminals.
- A grammar is simply a collection of rules.
- Nonterminal symbols can have two or more distinct definitions, representing two or more possible syntactic forms in the language, separated by the symbol |, meaning logical OR.
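Example: PASCAL if
    <if_stmt> → if <logic_expr> then <stmt>
    <if_stmt> → if <logic_expr> then <stmt> else <stmt>
or with the rule
    <if_stmt> → if <logic_expr> then <stmt>
              | if <logic_expr> then <stmt> else <stmt>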
Given the following statements, write the corresponding BNF.
1. var a: integer;
    <var_decl> → var <id> : <type> ;
    <id> → a | b | c
    <type> → integer | char | string | real
2. int a;
    <var_decl> → <type> <id> ;
    <type> → int | float | char
Accepted vs. Rejected Sentences
- A string or sentence is accepted if it belongs to the language, that is, if it can be generated from the grammar.
- 2 ways to check if a string is accepted:
    1. DERIVATION
    2. PARSE TREES

Derivation
- It is the process of generating the valid or accepted sentences of a language by applying a sequence of the production rules, beginning with the start symbol.
- 2 types:
    1. Leftmost Derivation: always replace the leftmost nonterminal
    2. Rightmost Derivation: always replace the rightmost nonterminal
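- A string is accepted if, at the end of the derivation, only terminal symbols are left and they match the string; otherwise, the string is rejected.
- Example: assignment
    <program> → begin <stmt_list> end
    <stmt_list> → <stmt>
                | <stmt> ; <stmt_list>
    <stmt> → <var> := <expression>
    <var> → A | B | C
    <expression> → <var> + <var>
                 | <var> - <var>
                 | <var>
- A derivation of a program in this language follows:
    <program> => begin <stmt_list> end
              => begin <stmt> ; <stmt_list> end
              => begin <var> := <expression> ; <stmt_list> end
              => begin A := <expression> ; <stmt_list> end
              => begin A := <var> + <var> ; <stmt_list> end
              => begin A := B + <var> ; <stmt_list> end
              => begin A := B + C ; <stmt_list> end
              => begin A := B + C ; <stmt> end
              => begin A := B + C ; <var> := <expression> end
              => begin A := B + C ; B := <expression> end
              => begin A := B + C ; B := <var> end
              => begin A := B + C ; B := C end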
- Another example: Simple assignment statements
    <assign> → <id> := <expr>
    <id> → A | B | C
    <expr> → <id> + <expr>
           | <id> * <expr>
           | ( <expr> )
           | <id>

    A := B * (A + C)
is generated by the leftmost derivation:
    <assign> => <id> := <expr>
             => A := <expr>
             => A := <id> * <expr>
             => A := B * <expr>
             => A := B * ( <expr> )
             => A := B * ( <id> + <expr> )
             => A := B * ( A + <expr> )
             => A := B * ( A + <id> )
             => A := B * ( A + C )
Therefore, the statement is accepted.
Will the statement B := A * (A * B + C) be accepted or rejected?
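Checking acceptance by hand amounts to searching for a derivation. For a grammar this small the search can be written as a recursive-descent recognizer, one function per nonterminal. The sketch below assumes the simple assignment grammar with nonterminals for an assignment, an identifier, and an expression; the function names and the tokenizing pattern are assumptions made for illustration.

```python
import re


def accepts(text):
    """Return True if text is a sentence of the simple assignment grammar."""
    tokens = re.findall(r":=|[A-Za-z]+|[()+*]|\S", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(token):
        nonlocal pos
        if peek() != token:
            raise SyntaxError(f"expected {token!r}")
        pos += 1

    def parse_id():
        # id -> A | B | C
        nonlocal pos
        if peek() not in ("A", "B", "C"):
            raise SyntaxError("expected A, B, or C")
        pos += 1

    def parse_expr():
        # expr -> id + expr | id * expr | ( expr ) | id
        if peek() == "(":
            expect("(")
            parse_expr()
            expect(")")
            return
        parse_id()
        if peek() in ("+", "*"):
            expect(peek())
            parse_expr()

    try:
        # assign -> id := expr
        parse_id()
        expect(":=")
        parse_expr()
    except SyntaxError:
        return False
    return pos == len(tokens)   # nothing may be left over
```

The worked example A := B * (A + C) is accepted, while a string such as A := B + is rejected because the derivation cannot finish with only terminals.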
PARSE TREE
- A hierarchical syntactic structure. It pictorially shows how the start symbol of a grammar derives a string in the language.
    Root node: start symbol
    Interior nodes: nonterminal symbols
    Leaf nodes: tokens or terminal symbols
Note:
A grammar that generates a sentence for which there are two or more distinct parse trees is said to be ambiguous. Such a sentence has more than one possible meaning, which may cause misinterpretation.
- Example: assignment
    <program> → begin <stmt_list> end
    <stmt_list> → <stmt>
                | <stmt> ; <stmt_list>
    <stmt> → <var> := <expression>
    <var> → A | B | C
    <expression> → <var> + <var>
                 | <var> - <var>
                 | <var>
Parse Tree Representation: begin A := B + C; B := C end
    <program>
      begin
      <stmt_list>
        <stmt>
          <var>: A
          :=
          <expression>
            <var>: B
            +
            <var>: C
        ;
        <stmt_list>
          <stmt>
            <var>: B
            :=
            <expression>
              <var>: C
      end
Exercise:
    S → aAS
      | a
    A → SbA
      | SS
      | ba
Determine whether each of the following strings is ACCEPTED or REJECTED:
1. aabbaa
2. abba
3. abaabaa
EXTENDED BNF (EBNF)
- Increases readability and writability.
Metasymbols or notations used:
    |       Multiple choice (used to represent alternative definitions)
    [ ]     Optional part
    { }*    Repeated zero or more times
    { }+    Repeated one or more times
Example:
    BNF:
        <expr> → <expr> + <term>
               | <expr> - <term>
               | <term>
        <term> → <term> * <factor>
               | <term> / <factor>
               | <factor>
    EBNF:
        <expr> → <term> {(+ | -) <term>}*
        <term> → <factor> {(* | /) <factor>}*
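One practical benefit of the EBNF form is that each { }* repetition maps directly onto a loop in a hand-written parser, whereas the left-recursive BNF rules cannot be coded as direct recursive calls. The sketch below evaluates expressions with one function per rule. It assumes, purely for illustration, that a factor is an unsigned integer literal, which the grammar above leaves undefined.

```python
import re


def evaluate(text):
    """Parse and evaluate text using the two EBNF rules, one function each."""
    tokens = re.findall(r"\d+|[-+*/]", text)
    pos = 0

    def factor():
        # Assumption: a factor is an unsigned integer literal.
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("expected a number")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        # term -> factor {(* | /) factor}*
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def expr():
        # expr -> term {(+ | -) term}*
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            value = value + right if op == "+" else value - right
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected input after expression")
    return result
```

Because each loop folds its operands from left to right, evaluate("8 - 3 - 2") gives 3, which is the left-associative result the left-recursive BNF rules describe.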
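Given the following C variable declarations, write the corresponding EBNF:
    int A;
    int A, B;
    char C, D, E;

Given the following if-then-else statements in Ada, write the corresponding EBNF:
1. if <condition> then <statement>;
2. if <condition> then <statement>;
   else <statement>;
3. if <condition> then <statement>;
   elsif <condition> then <statement>;
   else <statement>;
4. if <condition> then <statement>;
   elsif <condition> then <statement>;
   elsif <condition> then <statement>;
   else <statement>;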