Manual on Programming Languages

  • 7/31/2019 Manual on Programming Languages

    1/36

    Programming Languages

    UNIVERSITY OF THE CORDILLERAS

    1

    PRELIMINARIES

    1.1 Reasons for Studying Concepts of Programming Languages

    Increased capacity to express ideas.
    - It is difficult for people to conceptualize structures that they cannot describe, verbally or in writing.
    - Awareness of a wider variety of programming language features can reduce limitations in software development.
    - Programmers can increase the range of their software-development thought processes by learning new language constructs.
    - It builds an appreciation for valuable language features and encourages programmers to use these features.

    Improved background for choosing appropriate languages.
    - Many professional programmers have had little formal education in computer science and were trained on the job or through in-house training programs.
    - Many other programmers received their formal training in the early days of computer science education, when few languages were widely known.
    - The result of this narrow background is that many programmers, when given a choice of languages for a new project, continue to use the language with which they are most familiar, even if it is poorly suited to the new project.
    - If these programmers were familiar with other languages, they would be in a better position to make informed language choices.

    Increased ability to learn new languages.
    - Computer programming is a young discipline, and design methodologies, software development tools, and programming languages are still in a state of continuous evolution.
    - The process of learning a new programming language can be lengthy and difficult, especially for someone who is comfortable with only one or two languages and has never examined programming language concepts in general.
    - Once a thorough understanding of the fundamental concepts of languages is acquired, it becomes far easier to see how these concepts are incorporated into the design of the language being learned.
    - It is essential that practicing programmers know the vocabulary and fundamental concepts of programming languages so they can read and understand programming language manuals and sales literature for languages and compilers.

    Better understanding of the significance of implementation.
    - Allows us to visualize how a computer executes various language constructs.
    - Helps us understand the relative efficiency of alternative constructs that may be chosen for a program.
    - This in turn leads to the ability to use a language more intelligently, as it was designed to be used.
    - We can become better programmers by understanding the choices among programming language constructs and the consequences of those choices.

    - Certain kinds of program bugs can only be found and fixed by a programmer who knows some related implementation details.

    Increased ability to design new languages.
    - To a student, the possibility of being required at some future time to design a new programming language may seem remote.
    - However, most professional programmers occasionally do design languages of one sort or another.

    Overall advancement of computing.
    - Finally, there is a global view of computing that can justify the study of programming language concepts.
    - Although it is usually possible to determine why a particular programming language became popular, it is not always clear, at least in retrospect, that the most popular languages are the best available.
    - In some cases, it might be concluded that a language became widely used, at least in part, because those in positions to choose languages were not sufficiently familiar with programming language concepts.
    - In general, if those who choose languages are better informed, better languages will more quickly squeeze out poorer ones.

    1.2 Programming Domains

    Scientific Applications
    - Typically, scientific applications have simple data structures but require large numbers of floating-point arithmetic computations.
    - For some scientific applications where efficiency is the primary concern, like those that were common in the 1950s and 1960s, no subsequent language is significantly better than FORTRAN.

    Business Applications
    - The use of computers for business applications began in the 1950s.
    - The first successful high-level language for business was COBOL, which appeared in 1960.
    - Business languages are characterized, according to the needs of the application, by elaborate input and output facilities and decimal data types.
    - With the advent of microcomputers came new ways for businesses, especially small businesses, to use computers. Two specific tools, spreadsheet systems and database systems, were developed for business and now are widely used.

    Artificial Intelligence
    - AI is a broad area of computer applications characterized by the absence of exact algorithms and the use of symbolic computation rather than numeric computation.
    - Symbolic computation means that symbols, consisting of names rather than numbers, are manipulated.

    - The first widely used programming language developed for AI applications was the functional language LISP, which appeared in 1959 (Scheme is a later dialect of LISP).
    - An alternative approach to these applications appeared in the early 1970s: logic programming using the Prolog language.

    Systems Programming Languages
    - The operating system and all of the programming support tools of a computer system are collectively known as its systems software.
    - Systems software is used almost continuously and therefore must have execution efficiency.
    - A language for this domain must have low-level features that allow the software interfaces to external devices to be written.
    - In the 1960s and 1970s, some computer manufacturers, such as IBM, Digital, and Burroughs (now UNISYS), developed special machine-oriented high-level languages for systems software on their machines. For IBM mainframe computers, the language was PL/S, a dialect of PL/I; for Digital, it was BLISS, a language at a level just above assembly language; for Burroughs, it was Extended ALGOL.
    - The UNIX operating system is written almost entirely in C, which has made it relatively easy to port, or move, to different machines.

    Very High-Level Languages (VHLLs)
    - The languages in the category called very high-level have evolved slowly over the past 25 years.
    - The various scripting languages for UNIX are examples of VHLLs. A scripting language is one that is used by putting a list of commands, called a script, in a file to be executed.
    - The first of these languages, named shell, began as a small collection of commands that were interpreted to be calls to system subprograms that performed utility functions, such as file management and simple file filtering.
    - Other VHLLs are awk, for report generation, and tcl combined with tk, which provide a method for building X Window applications. Perl is a combination of shell and awk.

    Special-Purpose Languages
    - A host of special-purpose languages have appeared over the past 40 years.
    - They range from RPG, which is used to produce business reports, to APT, which is used for instructing programmable machine tools, to GPSS, which is used for systems simulation.

    Where Reg1 and Reg2 are registers. The semantics are:
        Reg1 ← contents(Reg1) + contents(memory_cell)
        Reg1 ← contents(Reg1) + contents(Reg2)

    VAX superminicomputers (orthogonal)
        ADDL operand_1, operand_2
    Where the semantics is:
        operand_2 ← contents(operand_1) + contents(operand_2)

    ALGOL 60: too orthogonal
    ex. Record + array (no restrictions on the types)
        if A == B = x+9 (condition, assignment and arithmetic)

    Control Statements
    - Facilities to transfer control of the program execution from one program part to another.
    - Indiscriminate use of goto statements severely reduces program readability.
    - Remedy when using goto:
      1. They must precede their targets, except when used for loops
      2. Their targets must never be too distant
      3. Their number must be limited
    - The control statement design of a language can be an important factor in the readability of programs written in that language.

    Data Types and Structures
    - The presence of adequate facilities for defining data types and data structures in a language is another significant aid to readability.
      Ex. Sum_is_too_big := 1 (no Boolean type)
          Sum_is_too_big := true (with a Boolean type)
          Record data type vs. collection of similar arrays

    Syntax Considerations
    - The syntax, or form, of the elements of a language has a significant effect on the readability of programs. The following are three examples of syntactic design choices that affect readability:
      1. Identifier forms
         - length of the identifier (names)
         - case sensitivity
         - presence of connectors
      2. Special words
         - delimiters
         - short-circuit evaluation
      3. Form and meaning
         - the meaning has to agree with, and follow from, the form or syntax.

    Writability
    - A measure of how easily a language can be used to create programs for a chosen problem domain.
    - Most of the language characteristics that affect readability also affect writability.

    Simplicity and Orthogonality
    - A smaller number of primitive constructs and a consistent set of rules for combining them (that is, orthogonality) is much better than simply having a large number of primitives.
    - A programmer can design a solution to a complex problem after learning only a simple set of primitive constructs.
    - BUT too much orthogonality can be a detriment to writability. Errors in writing programs can go undetected when nearly any combination of primitives is legal, leading to misuse or disuse.

    Support for Abstractions
    - Abstraction means hiding the details of implementation.
    - The degree of abstraction allowed by a programming language and the naturalness of its expression are very important to its writability.
    - Programming languages can support two distinct categories of abstraction:
      1. Data abstraction
         Ex. Binary tree
      2. Process abstraction
         Ex. subprograms

    Expressivity
    - There are very powerful operators that allow a great deal of computation to be accomplished with a very small program.
    - A language has relatively convenient, rather than cumbersome, ways of specifying computations.

    Reliability
    - A program is said to be reliable if it performs to its specifications under all conditions.
      1. Type Checking
         - Testing for type errors in a given program, either by the compiler or during program execution.
           Ex. Type compatibility between two variables.
         - The earlier errors in programs are detected, the less expensive it is to make the required repairs.
         - Consider space, time and accuracy.
      2. Exception Handling
         - The ability of a program to intercept run-time errors (as well as other unusual conditions detected by the program), take corrective measures, and then continue execution.
      3. Aliasing
         - Having two distinct referencing methods, or names, for the same memory cell.
         - It is now widely accepted that aliasing is a dangerous feature of a programming language.
      4. Readability and Writability
         - The easier a program is to write, the more likely it is to be correct.
         - Programs that are difficult to read are difficult both to write and to modify.
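
    The aliasing hazard listed under Reliability can be illustrated with a minimal Python sketch. The names scores, alias and independent are hypothetical and chosen only for illustration; the point is that two names bound to the same object let a change made through one name show up through the other.

```python
# Two names for the same list object: a change through one is visible through the other.
scores = [90, 85, 70]
alias = scores            # no copy is made; both names refer to one object
alias[0] = 0              # modify through the alias

print(scores[0])          # the original name sees the change
print(alias is scores)    # True: same object, two names

# An explicit copy removes the alias.
independent = list(scores)
independent[1] = 100
print(scores[1])          # unchanged by the write to the copy
```

    A reader who sees only the name scores has no local clue that alias[0] = 0 changed it, which is why aliasing makes programs harder to verify.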

    Cost
    - Cost of training programmers to use the language
    - Cost of writing programs in the language
    - Both the cost of training and the cost of writing programs
    - Cost of compiling programs
    - Cost of executing programs
    - Cost of compilers
    - Cost of poor reliability
    - Cost of maintaining programs
    - Optimization is the name given to the collection of methods that compilers may use to decrease the size and/or increase the execution speed of the code they produce.

    Criteria for evaluation:
    1. Portability
       - the ease with which programs can be moved from one implementation to another. (Standardization)
    2. Generality
       - the applicability to a wide range of applications.
    3. Well-definedness
       - the completeness and precision of the language's official defining document.

    1.4 Influences on Language Design

    Computer Architecture
    - The basic architecture of computers has a crucial effect on language design. Most of the popular languages of the past 35 years have been designed around the prevalent computer architecture, called the von Neumann architecture. These languages are called imperative languages. In a von Neumann computer, both data and programs are stored in the same memory. The CPU, which executes instructions, is separate from the memory.
    - Central features of imperative languages are:
      1. Variables: model the memory cells
      2. Assignment statements: based on the piping operation
      3. Iteration: the construct for repetition

    Programming Methodologies

    Data-Oriented
    - Simply put, data-oriented methods emphasize data design, concentrating on the use of logical, or abstract, data types to solve problems.
    - Object-oriented methodology begins with data abstraction, which encapsulates processing with data objects and hides access to data, and adds inheritance and dynamic type binding. Inheritance is a powerful concept that greatly enhances the possibility of reuse of existing software. Reuse of software components promises to significantly increase software development productivity.

    Process-Oriented
    - Opposite of data-oriented programming.
    - Focuses on concurrency.

    1.5 Language Design Trade-offs
    - The task of choosing constructs and features when designing a programming language involves a collection of compromises and trade-offs.
    - Conflicting criteria:
      1. Reliability vs. cost of execution
      2. Writability vs. readability
      3. Flexibility vs. safety

    1.6 Implementation Methods

    Compilation: translates programs from some high-level language to machine language, which can be executed directly on the computer.

    [Figure: the compilation process. Source program -> Lexical analyzer -> (lexical units) -> Syntax analyzer -> (parse trees) -> Intermediate code generator (and semantic analyzer) -> (intermediate code) -> Optimization (optional) -> Code generator -> (machine language) -> Computer, which takes the input data and produces the results. A symbol table supports the phases.]
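
    The first phase of compilation, lexical analysis, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token classes in TOKEN_SPEC and the function name tokenize are hypothetical and do not come from any particular compiler.

```python
import re

# Token classes for a tiny statement language (hypothetical, for illustration only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[=+\-*/;]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (kind, text) lexical units; raise on an unknown character."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":          # white space separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("count = count + 5;"))
```

    The resulting list of lexical units is what a syntax analyzer would consume to build a parse tree.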

    Pure Interpreter: programs are interpreted by another program (called the interpreter) without going through any form of translation.

    [Figure: pure interpretation. Source program and input data -> Interpreter -> Results]

    Hybrid Implementation Systems: translate high-level language programs to an intermediate language designed to allow easy interpretation.

    [Figure: hybrid implementation. Source program -> Lexical analyzer -> (lexical units) -> Syntax analyzer -> (parse trees) -> Intermediate code generator -> (intermediate code) -> Interpreter, which takes the input data and produces the results]
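
    CPython, the reference implementation of Python, is itself a hybrid system, so the two steps can be observed directly. The sketch below is illustrative only; the variable names source, code_object and namespace are hypothetical.

```python
import dis

source = "total = 2 + 3 * 4"

# Translation step: source text -> code object holding intermediate code (bytecode).
code_object = compile(source, "<demo>", "exec")

# Interpretation step: the virtual machine executes the intermediate code.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])    # value computed by interpreting the bytecode

# Optional: list the intermediate instructions (the exact listing varies by Python version).
dis.dis(code_object)
```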

    1.7 Programming Environments
    - The collection of tools used in the development of software:
      1. File system: secondary memory
      2. Text editor: with debugger, optimizer
      3. Linker: preliminary step in the completion of the result
      4. Compiler: large collection of integrated tools

    1.8 Programming Paradigms
    - A programming paradigm is a problem-solving approach.

    [Figure: classification of programming languages by paradigm. Programming languages divide into process-oriented and data-oriented families. Categories shown: imperative, data flow, functional, constraint, rule, object, and database. Subcategories shown: list processing, array processing, string processing, production system, logic, access oriented, and object oriented. Examples shown: FORTRAN, ALGOL, COBOL, BASIC, Pascal, C, Modula-2, Ada, LISP, Logo, Scheme, APL, SNOBOL, PERL, PROLOG, Level 5, VB, C++, Java, query languages, 4GLs, experimental languages, and spreadsheet packages.]

    NAMES, BINDINGS, TYPE CHECKING, AND SCOPES

    2.1 Names
    - Associated with variables, labels, subprograms and formal parameters
    - Design issues:
      1. What is the maximum length of a name?
      2. Can connector characters be used in names?
      3. Are names case sensitive?
      4. Are the special words reserved words or keywords?

    Special Words
    - Used to make programs more readable
    - Used to separate the syntactic entities of programs
    - Keyword: a word in a programming language that is special only in certain contexts
      Ex. PASCAL
          var true: integer;
              flag: boolean;
          begin
            flag := true;
            true := 1;
          end;
    - Reserved word: a special word that cannot be used as a name
      Ex. C
          int float = 2;
          /* illegal: you cannot use float as a variable name since it is a
             reserved word that signifies a data type */
    - Predefined names: names that have a predefined meaning but can be redefined by the user. They must be visible to the compiler when used.
      Ex. C
          clrscr();
          PASCAL
          writeln(); readln();
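
    The difference between reserved words and predefined names can also be checked mechanically. The following Python sketch is only an illustration: it uses the standard keyword module, and the helper names is_legal_assignment and demo are hypothetical.

```python
import keyword

# Reserved words cannot be used as names.
print(keyword.iskeyword("while"))    # True
print(keyword.iskeyword("len"))      # False: len is only a predefined name

def is_legal_assignment(name):
    """Return True if 'name = 1' compiles, i.e. the name may be used as a variable."""
    try:
        compile(f"{name} = 1", "<demo>", "exec")
        return True
    except SyntaxError:
        return False

print(is_legal_assignment("while"))  # False
print(is_legal_assignment("len"))    # True

# A predefined name can be redefined by the user within a scope, hiding the built-in meaning.
def demo():
    len = 3                # local redefinition of the predefined name
    return len + 1

print(demo())
```

    Redefining a predefined name is legal but hurts readability, since readers expect len to keep its usual meaning.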

    2.2 Variables
    - An abstraction of the computer memory cell or collection of cells.
    - A variable can be characterized as a sextuple of attributes:
      1. Name: as discussed in 2.1; names are often referred to as identifiers.
      2. Address: the memory address with which the variable is associated.
      3. Value: the contents of the memory cell or cells associated with the variable.
      4. Type: determines the range of values the variable can have and the set of operations that are defined for values of the type.
      5. Lifetime: the time during which the variable is bound to a specific memory location.
      6. Scope: for a variable declared in a procedure, the scope is from its declaration to the end reserved word of the procedure.

    2.3 The Concept of Binding
    - Binding is an association, such as between an attribute and an entity or between an operation and a symbol.
    - Binding time is the time at which a binding takes place.
    - Bindings can take place at language design time, language implementation time, compile time, link time, load time, or run time.
      Ex. C
          int count;
          . . .
          count = count + 5;
    - Some of the bindings and their binding times for the parts of this assignment statement are as follows:
      - Set of possible types for count: bound at language design time.
      - Type of count: bound at compile time.
      - Set of possible values of count: bound at compiler design time.
      - Value of count: bound at execution time with this statement.
      - Set of possible meanings for the operator symbol +: bound at language definition time.
      - Meaning of the operator symbol + in this statement: bound at compile time.
      - Internal representation of the literal 5: bound at compiler design time.

    Binding of Attributes to Variables
    - A binding is static if it occurs before run time and remains unchanged throughout program execution.
    - A binding is dynamic if it occurs during run time or can change in the course of program execution.

    Type Bindings
    - Before a variable can be referenced in a program, it must be bound to a data type.
    - The two important aspects of this binding are how the type is specified and when the binding takes place.
    - Types can be specified statically through some form of explicit or implicit declaration.
      - An explicit declaration is a statement in a program that lists variable names and declares them to be of a particular type.
      - An implicit declaration is a means of associating variables with types through default conventions instead of declaration statements.

    Dynamic Type Binding
    - The type is not specified by a declaration statement.
    - The variable is bound to a type when it is assigned a value in an assignment statement.
    - When the assignment statement is executed, the variable being assigned is bound to the type of the value, variable or expression on the right side of the assignment.
    - The primary advantage of dynamic binding of variables to types is that it provides a great deal of programming flexibility.

      Ex. APL (the juxtaposed vector literal is APL notation; SNOBOL4 also binds types dynamically)
          LIST ← 10.2 5.1 0.0 (causes LIST to be a one-dimensional array)
          LIST ← 47 (causes LIST to be an integer variable)
    - Languages with dynamic type binding are often implemented using interpreters rather than compilers. This is partially because it is difficult to change dynamically the types of variables in machine code.
    - There are two disadvantages of dynamic type binding:
      1. The error detection capability of the compiler is diminished relative to a compiler for a language with static type bindings, because any two types can appear on opposite sides of the assignment operator.
      2. The cost of implementing dynamic attribute binding is considerable, particularly in execution time. Type checking must be done at run time. Furthermore, every variable must have a descriptor associated with it to maintain the current type. The descriptors must also be of varying size, because more space is needed if the variable is a structured type than if it is a primitive type.
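
    Python also binds types dynamically, so the LIST example and the first disadvantage can be reproduced in a short sketch. The function name average is hypothetical and used only for illustration.

```python
# The same variable is bound to a new type by each assignment.
LIST = [10.2, 5.1, 0.0]
print(type(LIST).__name__)    # list
LIST = 47
print(type(LIST).__name__)    # int

def average(values):
    """Intended for a sequence of numbers; nothing checks this before run time."""
    return sum(values) / len(values)

# The misuse below is legal to write and is only detected when the call executes.
try:
    average(LIST)
except TypeError as error:
    print("detected at run time:", error)
```

    A compiler for a statically typed language would reject the call to average before the program ever ran.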

    Type Inference
    - An inferencing mechanism in which the types of most expressions can be determined without requiring the programmer to specify the types of the variables.
      Ex. (ML)
          fun circumf (r) = 3.14159 * r * r;
          - a function that takes a real argument and produces a real result.
          fun times10 (x) = 10 * x;
          - the argument and functional value are inferred to be of type integer.
          fun square (x) = x * x;
          - the type cannot be inferred. Instead, specify it explicitly in one of these ways:
          fun square (x) : int = x * x;
          fun square (x : int) = x * x;
          fun square (x) = (x : int) * x;
          fun square (x) = x * (x : int);
    - Type inference is also used in purely functional languages.

    Storage Bindings and Lifetime

    - Allocation is the process by which the memory cell to which a variable is bound is taken from a pool of available memory.
    - Deallocation is the process of placing a memory cell that has been unbound from a variable back into the pool of available memory.
    - The lifetime of a program variable is the time during which the variable is bound to a specific memory location. So the lifetime of a variable begins when it is bound to a specific cell and ends when it is unbound from that cell.
    - Static Variables
      - Those that are bound to memory cells before program execution begins and remain bound to those same memory cells until program execution terminates.
      - The greatest advantage of static variables is efficiency. All addressing of static variables can be direct.

      - No run-time overhead is incurred for allocation and deallocation.
      - The disadvantage of static binding to storage is reduced flexibility; in particular, in a language that has only variables that are statically bound to storage, recursive subprograms are not supported.
    - Stack-Dynamic Variables
      - Those whose storage bindings are created when their declaration statements are elaborated, but whose types are statically bound.
      - Elaboration of such a declaration refers to the storage allocation and binding process indicated by the declaration, which takes place when execution reaches the code to which the declaration is attached.
      - Elaboration occurs during run time.
      - In C, local variables are by default stack-dynamic but can be made static by including the static qualifier in their definitions.
    - Explicit Heap-Dynamic Variables
      - Are nameless objects whose storage is allocated and deallocated by explicit run-time instructions specified by the programmer. These variables, which are allocated from and deallocated to the heap, can only be referenced through pointer variables.
      - An example using a C++ code segment:
          int *intnode;
          . . .
          intnode = new int;   // allocates an int object
          . . .
          delete intnode;      // deallocates the object to which intnode points
      - Explicit heap-dynamic variables are often used for dynamic structures, such as linked lists and trees, that need to grow and shrink during execution.
      - The disadvantages of such variables are the difficulty of using them correctly and the cost of references, allocations and deallocations.
    - Implicit Heap-Dynamic Variables
      - Are bound to heap storage only when they are assigned values. In fact, all their attributes are bound every time they are assigned.
      - The advantage of such variables is that they have the highest degree of flexibility, allowing highly generic code to be written.
      - The disadvantage is the run-time overhead of maintaining all the dynamic attributes, which could include array subscript types and ranges, among others. Another is the loss of some error detection by the compiler.

    2.4 Type Checking
    - Is the activity of ensuring that the operands of an operator are of compatible types.
    - A compatible type is one that is either legal for the operator or is allowed under language rules to be implicitly converted by compiler-generated code to a legal type.
    - This automatic conversion is called coercion.
    - A type error is the application of an operator to an operand of an inappropriate type.
    - If all bindings of variables to types are static in a language, then type checking can nearly always be done statically.

    - Dynamic type binding requires type checking at run time, which is called dynamic type checking.
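
    Coercion, type errors and dynamic type checking can all be seen in a minimal Python sketch. The function name add is hypothetical; Python is used here only because it performs its type checking at run time.

```python
# Coercion: the int operand is implicitly converted so that int + float is legal.
result = 1 + 2.5
print(type(result).__name__, result)    # float 3.5

# Type error: no coercion is defined between str and int for '+'.
def add(a, b):
    return a + b

try:
    add("total: ", 5)
except TypeError as error:
    print("type error detected at run time:", error)

# The programmer can request an explicit conversion instead.
print(add("total: ", str(5)))
```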

    2.5 Strong Typing
    - A strongly typed language is one in which each name in a program in the language has a single type associated with it, and that type is known at compile time.
    - All types are statically bound.
    - A programming language is said to be strongly typed if type errors are always detected.
    - The importance of strong typing lies in its ability to detect all misuses of variables that result in type errors.
    - A strongly typed language also allows detection, at run time, of uses of incorrect type values in variables that can store values of more than one type.

    2.6 Type Compatibility
    - Name type compatibility means that two variables have compatible types only if they are in either the same declaration or in declarations that use the same type name.
    - Structure type compatibility means that two variables have compatible types if their types have identical structures.
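
    The two notions of compatibility can be contrasted in a short Python sketch. Everything here is an assumption made for illustration: the classes Celsius and Fahrenheit and the helpers name_compatible and structure_compatible are hypothetical, and Python itself does not enforce either rule on assignment.

```python
from dataclasses import dataclass, fields

@dataclass
class Celsius:
    degrees: float

@dataclass
class Fahrenheit:
    degrees: float

def name_compatible(a, b):
    """Compatible only if both values were declared with the same type name."""
    return type(a) is type(b)

def structure_compatible(a, b):
    """Compatible if both types have the same field names and field types."""
    def shape(obj):
        return [(field.name, field.type) for field in fields(obj)]
    return shape(a) == shape(b)

c = Celsius(21.0)
f = Fahrenheit(69.8)
print(name_compatible(c, f))         # False
print(structure_compatible(c, f))    # True
```

    The example also hints at the usual trade-off: structure compatibility is more flexible, but it would let a Fahrenheit value be used where a Celsius value was intended.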

    2.7 Scope
    - The scope of a program variable is the range of statements in which the variable is visible.
    - A variable is visible in a statement if it can be referenced in that statement.
    - A variable is local in a program unit or block if it is declared there.
    - The nonlocal variables of a program unit or block are those that are visible within the program unit or block but are not declared there.

    Static Scope
    - The scope of a variable can be statically determined, that is, prior to execution.
    - See example in page 172.

    Blocks
    - A section of code.

    Dynamic Scope
    - Is based on the calling sequence of subprograms, not on their spatial relationship to each other.
    - The scope can be determined only at run time.
    - See example in page 177.

    Evaluation of Dynamic Scoping
    - The correct attributes of non-local variables visible to a program statement cannot be determined statically.
    - Several kinds of programming problems follow directly from dynamic scoping.
    - Dynamic scoping results in less reliable programs than static scoping.
    - Inability to statically type check references to non-locals.

    - Dynamic scoping also makes programs much more difficult to read, because the calling sequence of subprograms must be known to determine the meaning of references to non-local variables.
    - On the other hand, dynamic scoping can be used to advantage in programming: subprograms inherit the context of their callers.
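
    The difference between the two scoping rules can be sketched in Python, which is statically scoped. Dynamic scoping is only simulated here with an explicit stack of name tables; call_stack, lookup, dyn_sub1 and dyn_sub2 are hypothetical names introduced for the illustration.

```python
# Static scoping: Python resolves x in sub2 from the text of the program.
x = "global x"

def sub2():
    return x              # refers to the module-level x, whoever the caller is

def sub1():
    x = "sub1's x"        # local to sub1; not visible inside sub2
    return sub2()

print(sub1())             # global x

# Dynamic scoping, simulated: names are looked up in the most recent active caller first.
call_stack = [{"x": "global x"}]     # one dictionary of local names per active subprogram

def lookup(name):
    for frame in reversed(call_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)

def dyn_sub2():
    call_stack.append({})            # sub2 declares no x of its own
    try:
        return lookup("x")
    finally:
        call_stack.pop()

def dyn_sub1():
    call_stack.append({"x": "sub1's x"})
    try:
        return dyn_sub2()
    finally:
        call_stack.pop()

print(dyn_sub1())         # sub1's x
print(dyn_sub2())         # global x (called directly, so only the global frame is active)
```

    Under the simulated dynamic rule, the meaning of x inside dyn_sub2 depends on who called it, which is exactly why such references cannot be checked statically.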

    2.8 Scope and Lifetime
    - Relation: The scope of a variable is from its declaration to the end reserved word of the procedure. The lifetime of that variable is the period of time beginning when the procedure is entered and ending when execution of the procedure reaches the end.

    2.9 Referencing Environments
    - The referencing environment of a statement is the collection of all names that are visible in the statement.
    - The referencing environment of a statement in a static-scoped language is the variables declared in its local scope plus the collection of all variables of its ancestor scopes that are visible. (See pages 180-181 for examples.)

    2.10 Named Constants
    - A named constant is a variable that is bound to a value only at the time it is bound to storage; its value cannot be changed by assignment or by an input statement.
    - Named constants are useful as aids to readability and program reliability.
    - Readability can be improved, for example, by using the name pi instead of the constant 3.14159.

    2.11 Variable Initialization
    - It is convenient for variables to have values before the code of the program or subprogram in which they are declared begins executing.
    - The binding of a variable to a value at the time it is bound to storage is called initialization.
    - If the variable is statically bound to storage, binding and initialization occur before run time.
    - If the storage binding is dynamic, initialization is also dynamic.

    DATA TYPES

    3.1 Introduction

    Computer programs produce results by manipulating data. An important factor in determining the ease with which they can perform this task is how well the data types match the real-world problem space. It is crucial, therefore, that a language supports the proper variety of data types and structures.

    3.2 Primitive Data Types

    The data types that are not defined in terms of other types are called primitive data types. Nearly all programming languages provide a set of primitive data types.

    The primitive data types of a language are used, along with one or more type constructors, to provide the structured types.

    a. Numeric Types

    Integer
    - The most common primitive data type
    - Represented in a computer by a string of bits, with one of the bits representing the sign.
    - Implementations:
        without descriptor: | sign bit | binary integer |
        with descriptor (two layouts shown): | type descriptor | sign bit | binary integer |

    Floating Point
    - Model real numbers, but the representations are only approximations for most real numbers.
    - Have value ranges that are defined in terms of precision and range. (Ex. π, e)
    - Problem: loss of accuracy through arithmetic operations
    - Implementations:
        Single precision: | sign bit | exponent (8 bits) | fraction (23 bits) |


      Double precision: sign bit, 11-bit exponent, 52-bit fraction
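The loss of accuracy mentioned for floating-point types is easy to observe. The following is a minimal Python sketch, assuming the usual case where Python floats are IEEE 754 double-precision values; the helper name `float_bits` is hypothetical. It shows that a sum of decimal fractions is only approximated, and it splits a double into its sign, 11-bit exponent, and 52-bit fraction fields.

```python
import struct

def float_bits(x):
    """Split a double into its (sign, exponent, fraction) bit fields."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = raw >> 63
    exponent = (raw >> 52) & 0x7FF        # 11 bits
    fraction = raw & ((1 << 52) - 1)      # 52 bits
    return sign, exponent, fraction

# 0.1 and 0.2 are stored as approximations, so their sum is not exactly 0.3.
print(0.1 + 0.2 == 0.3)
print(float_bits(1.0))
```

The exponent field is stored with a bias, which is why 1.0 does not have an exponent field of zero.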

    Decimal
    - Store a fixed number of decimal digits, with the decimal point at a fixed position in the value.
    - Use the binary coded decimal (BCD) representation.
    - Ex. PL/I:
          DECLARE X FIXED DECIMAL(10,3)
      COBOL:
          X PICTURE 999V99
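To make the BCD idea concrete, here is a small Python sketch of packed BCD, in which each decimal digit occupies four bits and two digits share a byte. The function names `to_bcd` and `from_bcd` are hypothetical, and real PL/I or COBOL implementations may also store a sign and differ in layout.

```python
def to_bcd(digits):
    """Pack a string of decimal digits into BCD, two digits per byte."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected decimal digits only")
    if len(digits) % 2:                  # pad to an even number of digits
        digits = "0" + digits
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))

def from_bcd(packed):
    """Unpack BCD bytes back into a string of decimal digits."""
    return "".join(f"{b >> 4}{b & 0x0F}" for b in packed)

# 123.45 in the spirit of the picture 999V99: only the digits are stored,
# and the position of the decimal point is fixed by the declaration.
packed = to_bcd("12345")
print(packed.hex())
print(from_bcd(packed))
```

Because every digit is stored exactly, decimal types avoid the approximation seen with binary floating point, at the cost of extra storage.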

    Boolean Types
    - Have a range of only 2 elements (true or false).
    - Often used for switches or flags.
    - Could be represented by a single bit, but the smallest addressable unit is normally used.

    Character Types
    - Stored as numeric codes.
    - Commonly use the ASCII representation.
    - Java uses the Unicode representation.

    3.3 Structured Data Types

    Character String Types
    - One in which the objects consist of sequences of characters.
    - Design issues:
        Should strings be a primitive type or simply a special kind of character array?
        Should strings have static or dynamic length?
    - String Length Options:

      Static length string
        Ex:
          A : String[20]                      (Pascal)
          Character (len=15) Name1, Name2     (Fortran)
        Implementation:
        - Requires a compile-time descriptor with a field for the length.
          [Figure: compile-time descriptor for static strings, with the fields Static string, Length, Address]


      Limited dynamic length
        o Allows strings to have varying length up to a declared and fixed maximum set by the variable definition.
        o Ex: char A[20];
        o Implementation:
          Requires a run-time descriptor to store both the fixed maximum length and the current length.

      Dynamic length string
        o Strings have varying length with no maximum.
        o Provides maximum flexibility.
        o Ex: Snobol4
            Newline = trim(input)
        o Implementation:
          Requires a simpler run-time descriptor; only the current length needs to be stored.
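The run-time descriptor for a limited dynamic length string can be sketched as follows. This is an illustrative Python model, not how any particular language implements strings; the class name `LimitedDynamicString` and its members are hypothetical. The descriptor holds the fixed maximum length and the current length, and an assignment that exceeds the maximum is rejected.

```python
class LimitedDynamicString:
    """String whose length may vary up to a fixed, declared maximum."""

    def __init__(self, max_length, text=""):
        self.max_length = max_length       # fixed when the variable is defined
        self.buffer = [""] * max_length    # storage the address field would refer to
        self.current_length = 0
        self.assign(text)

    def assign(self, text):
        if len(text) > self.max_length:
            raise ValueError("string exceeds declared maximum length")
        self.buffer[:len(text)] = text     # overwrite the leading characters
        self.current_length = len(text)

    def value(self):
        return "".join(self.buffer[:self.current_length])

s = LimitedDynamicString(20, "hello")
s.assign("hi")
print(s.value(), s.current_length, s.max_length)
```

A fully dynamic string would drop `max_length` from the descriptor and allocate new storage whenever the string grows.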

    User-Defined Ordinal Types
    - An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers.

    Enumeration Types
    An enumeration type is one in which all of the possible values, which are symbolic constants, are enumerated in the definition.

    Ex. (Ada)
      type DAYS is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);

    Design Issues:
      Is a literal constant allowed to appear in more than one type definition?
      And if so, how is the type of an occurrence of the literal in the program checked?

    [Figure: run-time descriptor for limited dynamic strings, with the fields Limited dynamic string, Maximum length, Current length, Address]


    Designs:

    Pascal:
      Literals are not allowed to be used in more than one enumeration type definition.
      Enumeration type variables can be used as
        - array subscripts
        - for loop variables
        - case selector expressions
      They can be compared using the relational operators.
      Example:
        type colortype = (red, blue, green, yellow);
        var color : colortype;
        . . .
        color := blue;
        if (color > red) . . .

    Ada:
      Literals are allowed to appear in more than one declaration in the same referencing environment. These are called overloaded literals.

      Example:
        type LETTERS is (A, B, C, D, E, F, G, H, I, J, K, L, M,
                         N, O, P, Q, R, S, T, U, V, W, X, Y, Z);
        type VOWELS is (A, E, I, O, U);

        for LETTER in A..U loop                        (ambiguous)
        for LETTER in VOWELS'(A)..VOWELS'(U) loop

    Evaluation:

    Common operations for enumeration types are predecessor, successor, position in the list of values, and value for a given position number. In Pascal, these operations are provided by built-in functions. For example, pred(blue) is red. In Ada, they are attributes. For example, LETTERS'PRED(B) is A.

    Enumeration types provide greater readability in a very direct way: named values are easily recognized, whereas coded values are not. They also provide type checking.

    Subrange Types
    A subrange type is a contiguous subsequence of an ordinal type. For example, 12..14 is a subrange of the integer type.


    Evaluation:

    Subrange types enhance readability by making it clear to readers that variables of subrange types can store only certain ranges of values.

    Reliability is increased with subrange types, because assigning a value to a subrange variable that is outside the specified range is detected as an error by the run-time system.

    - Implementation of User-Defined Ordinal Types

    Enumeration types are usually implemented by associating a non-negative integer value with each symbolic constant in the type.

    Typically, the first enumeration value is represented as 0, the second as 1, and so forth. As long as the association is constant, the integers can be used in place of the enumeration constants. Of course, the operations allowed are dramatically different from those on integers, except for the relational operators, which are identical.

    In ANSI C and C++, enumeration types are often treated exactly like integers.

    Subrange types are implemented in exactly the same way as their parent types, except that range checks must be included in every assignment. This increases code size and execution time but is usually considered well worth the cost. Also, a good optimizing compiler can optimize some of the checking away.
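The implementation strategy above can be sketched in Python. This is only an illustration under stated assumptions: `Color`, `pred`, and `Subrange` are hypothetical names, and Python's `IntEnum` stands in for a compiler's integer encoding of enumeration constants. The subrange class performs the range check on every assignment, as the text describes.

```python
from enum import IntEnum

class Color(IntEnum):
    # Values 0, 1, 2, 3 mirror the usual integer encoding described above.
    RED = 0
    BLUE = 1
    GREEN = 2
    YELLOW = 3

def pred(value):
    """Predecessor of an enumeration value, similar to Pascal's pred."""
    return type(value)(value - 1)   # raises ValueError below the first value

class Subrange:
    """Variable restricted to low..high, checked on every assignment."""

    def __init__(self, low, high, initial):
        self.low, self.high = low, high
        self.set(initial)

    def set(self, value):
        if not self.low <= value <= self.high:
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        self.value = value

print(pred(Color.BLUE))          # predecessor of blue
print(Color.BLUE > Color.RED)    # relational operators work as on integers
index = Subrange(12, 14, 12)
index.set(14)
```

Calling `index.set(15)` raises an error, which corresponds to the run-time detection of out-of-range assignments.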

    Array Types
    - An array is a homogeneous aggregate of data elements in which the individual element is identified by its position in the aggregate, relative to the first element.
    - Arrays are referenced by means of a two-level syntactic mechanism:
        Aggregate name
        Subscripts or indexes
      Syntax: array_name[index] → element

    Static binding: binding of the subscript type to an array variable

    Dynamic binding: binding of the subscript value ranges


    Four Categories of Arrays

    1. Static Array: the subscript ranges are statically bound and storage allocation is static (done before run time). The advantage of static arrays is efficiency: no dynamic allocation or deallocation is required.

    2. Fixed Stack-Dynamic Array: the subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution. The advantage of fixed stack-dynamic arrays over static arrays is space efficiency. A large array in one procedure can use the same space as a large array in a different procedure, as long as both procedures are never active at the same time.

       Eg.
         A : array[1..10] of integer;    (Pascal)
         int A[10];                      (C/C++)

    3. Stack-Dynamic Array: the subscript ranges are dynamically bound and storage allocation is dynamic (done during run time). Once the subscript ranges are bound and the storage is allocated, they remain fixed during the lifetime of the variable. Its major advantage over static and fixed stack-dynamic arrays is flexibility.

       Eg. Ada
         Get (LIST_LEN);
         declare
           LIST : array (1..LIST_LEN) of INTEGER;
         begin
           . . .
         end;

    4. Heap-Dynamic Array: the binding of subscript ranges and storage allocation is dynamic, and can change any number of times during the array's lifetime. Arrays can grow and shrink during execution as the need for space changes.

       Eg. Visual Basic
         Dim StudArr() As String
         ReDim StudArr(10)
         ReDim Preserve StudArr(15)

    The number of subscripts in arrays may vary.
    Eg.
      FORTRAN I: limited to 3 dimensions
      FORTRAN IV: up to 7 dimensions
      Contemporary languages: no limitation


    Array Initialization

    Fortran 77:
      INTEGER LIST (3)
      DATA LIST /0, 5, 5/

    ANSI C/C++:
      int list[] = {4, 5, 7, 83};
      char name[] = "Freddie";
      char *names[] = {"Jo", "Bob", "Jake", "Darcie"};

    Ada:
      LIST : array (1..5) of INTEGER := (1, 3, 5, 7, 9);
      BUNCH : array (1..5) of INTEGER := (1=>3, 3=>4, others=>0);

    Slices
    - A slice of an array is some substructure of that array.

    See example on Page 213

    Implementation of array types
    - Requires more compile-time effort than simple built-in data types.
    - The code to allow accessing of array elements must be generated at compile time.
    - At run time, this code must be executed to produce element addresses.
    - Two ways to map multi-dimensional arrays to one dimension:
        1. Row Major Order
        2. Column Major Order
    - The compile-time descriptor for single-dimensional arrays: the information in the descriptor is required to construct the access function.
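The access function for a two-dimensional array can be written out directly. The sketch below assumes zero-based subscripts and a contiguous block of equally sized elements; the function names are hypothetical. It computes the position of element (i, j) under both storage orders and turns that position into an address.

```python
def row_major_offset(i, j, n_cols):
    """Position of element (i, j) when rows are stored one after another."""
    return i * n_cols + j

def column_major_offset(i, j, n_rows):
    """Position of element (i, j) when columns are stored one after another."""
    return j * n_rows + i

def element_address(base, offset, element_size):
    """Address of an element given the array base address."""
    return base + offset * element_size

# Element (1, 2) of a 3 x 4 array of 4-byte elements stored at address 1000.
rm = row_major_offset(1, 2, n_cols=4)
cm = column_major_offset(1, 2, n_rows=3)
print(rm, element_address(1000, rm, 4))
print(cm, element_address(1000, cm, 4))
```

For languages whose subscripts start at a lower bound other than zero, the lower bound would be subtracted from each subscript first.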

    Record Types
    A record is a possibly heterogeneous aggregate of data elements in which the individual elements are identified by names.

    Records vs. Arrays

      Records                                Arrays
      - heterogeneous                        - homogeneous
      - fields are named with identifiers    - elements are referenced by index
      - allowed to include unions

    [Figure: compile-time descriptor for arrays, with the fields Array, Element type, Index type, Number of dimensions, Index]


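    Pascal:

      type empRec = record
             fn, mi, ln : string[30];
             dept : string[2];
           end;
      var emp : empRec;
      begin
        writeln(emp.dept);
      end.

    C:

      #include <stdio.h>

      typedef struct {
          char fn[30], mi[30], ln[30];
          char dept[3];               /* two characters plus the terminating '\0' */
      } empRec;

      int main(void)
      {
          empRec emp = {0};           /* zero-initialized so dept is a valid empty string */
          printf("%s\n", emp.dept);
          return 0;
      }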

    To reference an element:
      Dot notation (Pascal and C/C++)
      % notation (Fortran)

    Fully qualified reference: all intermediate record names from the largest enclosing record to the specified field are named in the reference.

      employee.name := 'Bob';
      employee.age := 42;
      employee.sex := 'M';
      employee.salary := 23750.00;

    Elliptical reference: record names can be omitted.

      with employee do
      begin
        name := 'Bob';
        age := 42;
        sex := 'M';
        salary := 23750.0;
      end; {end of with}


    Implementation of Records:

    [Figure: compile-time descriptor for a record: for each field (Field 1 ... Field n) a name, type, and offset, followed by the address of the record]

    - Fields of records are stored in adjacent memory locations.
    - Field accesses are all handled using the offsets.

    Union Types
    A union is a type that may store values of different types at different times during program execution.

    Design Issues:
      Should type checking be required?
      Should unions be embedded in records?

    FORTRAN Union Types (EQUIVALENCE)

      INTEGER X
      REAL Y
      EQUIVALENCE (X, Y)

    - X and Y cohabit the same storage location.
    - X and Y are aliases.
    - No type checking is done.

    ALGOL 68 Union Types
    - The type of the value currently stored can be detected at run time.
    - A discriminated union uses a tag, or discriminant.
    - The tag/discriminant identifies the type of the value currently stored.

      union (int, real) ir1, ir2

      union (int, real) ir1;
      int count;
      . . .
      ir1 := 33;
      . . .
      count := ir1;

      The first assignment is legal, but the second is not because the system cannot statically check the type of ir1.

    - Conformity clauses solve the problem of type checking for union types.

      union (int, real) ir1;
      int count;
      real sum;
      . . .


      case ir1 in
        (int intval) : count := intval,
        (real realval) : sum := realval
      esac
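The idea behind a discriminated union with a conformity clause can be imitated in Python. This is a sketch only: `IntOrReal` and `conformity` are hypothetical names, and Python's dynamic typing is used to stand in for the run-time tag. The value can only be used through a dispatch on the tag, so an int is never mistaken for a real.

```python
class IntOrReal:
    """Discriminated union of int and float with a run-time tag."""

    def __init__(self, value):
        self.assign(value)

    def assign(self, value):
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("only int or float values may be stored")
        # The tag (discriminant) records which type is currently stored.
        self.tag = int if isinstance(value, int) else float
        self.value = value

def conformity(u, on_int, on_real):
    """Dispatch on the tag, in the spirit of ALGOL 68's case ... esac."""
    if u.tag is int:
        return on_int(u.value)
    return on_real(u.value)

ir1 = IntOrReal(33)
print(conformity(ir1, lambda intval: ("count", intval),
                      lambda realval: ("sum", realval)))
ir1.assign(2.5)
print(conformity(ir1, lambda intval: ("count", intval),
                      lambda realval: ("sum", realval)))
```

Each branch receives a value whose type is known, which is what makes the access type safe.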

    PASCAL Union Types
    - The union is integrated with a record structure.
    - Uses a tag, or discriminant.
    - Called a variant record.

      type shape = (circle, triangle, rectangle);
           object = record
             case form : shape of
               circle: (diameter : real);
               triangle: (leftside : integer;
                          rightside : integer;
                          angle : real);
               rectangle: (side1 : integer;
                           side2 : integer)
           end;
      var figure : object;

    [Figure: storage for figure, beginning with the discriminant (form)]

    Problem: a user program can change the tag without making the corresponding change in the variant.

    Eg.
      figure.form := circle;
      figure.side1 := 25;

    ADA Union Types
    - The tag cannot be changed without making the corresponding change in the variant.
    - Checking the tag is required for all references to variants.
    - Constrained variant variable: stores only one of the possible types of values in the variant, thus allowing static type checking. The tag is treated as a named constant.
    - Unconstrained variant variable: the values of the variant can be changed during execution; however, the whole record must be changed, including the tag.

    [Figure: variants sharing the same storage: Circle: diameter; Rectangle: side1, side2; Triangle: leftside, rightside, angle]


      type SHAPE is (CIRCLE, TRIANGLE, RECTANGLE);
      type OBJECT (FORM : SHAPE) is record
        case FORM is
          when CIRCLE => DIAMETER : FLOAT;
          when TRIANGLE => LEFT_SIDE : INTEGER;
                           RIGHT_SIDE : INTEGER;
                           ANGLE : FLOAT;
          when RECTANGLE => SIDE_1 : INTEGER;
                            SIDE_2 : INTEGER;
        end case;
      end record;

      FIGURE_1 : OBJECT;                      -- unconstrained, no initial value
      FIGURE_2 : OBJECT (FORM => TRIANGLE);   -- constrained

    Set Types
    A set type is one whose variables can store unordered collections of distinct values from some ordinal type called its base type.

    Set types are often used to model mathematical sets.

    Sets in Pascal and Modula-2
    - Represent sets as bit strings that fit into a single machine word.
    - Set operations:
        Set union
        Set intersection
        Set difference
        Set equality

      type colors = (red, blue, green, yellow, orange, white, black);
           colorset = set of colors;
      var set1, set2 : colorset;

    Constant values can be assigned to the set variables set1 and set2 as in

      set1 := [red, blue, yellow, white];
      set2 := [black, blue];

    - Set types are usually stored as bit strings in memory.
    - Present element: bit set to 1 (set bit)
    - Absent element: bit set to 0 (clear bit)

    Set Operations:

      type chars = 'a'..'g';
           charset = set of chars;
      var set1, set2 : charset;
          set3 : charset;
      begin
        set1 := ['a', 'c', 'f', 'e'];    { 1010110 }
        set2 := ['a', 'b', 'c', 'g'];    { 1110001 }


    1. Union: set1 ∪ set2
        set3 := set1 + set2;             { set3 = ['a', 'b', 'c', 'e', 'f', 'g'], 1110111 }
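The bit-string representation makes the set operations cheap, because each one maps to a single bitwise operation on a machine word. The Python sketch below assumes the base type 'a'..'g' from the example; `BASE`, `to_bits`, and `show` are hypothetical helper names. It prints the bit strings with the first base element leftmost, matching the notation used above.

```python
BASE = "abcdefg"    # base type 'a'..'g'

def to_bits(elements):
    """Build the bit string for a set: one bit per element of the base type."""
    bits = 0
    for ch in elements:
        bits |= 1 << BASE.index(ch)
    return bits

def show(bits):
    """Bit string with the first base element leftmost."""
    return "".join("1" if (bits >> i) & 1 else "0" for i in range(len(BASE)))

set1 = to_bits("acfe")
set2 = to_bits("abcg")

print(show(set1), show(set2))
print("union       ", show(set1 | set2))     # bitwise OR
print("intersection", show(set1 & set2))     # bitwise AND
print("difference  ", show(set1 & ~set2))    # AND with the complement
print("equal       ", set1 == set2)          # word comparison
```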


    2 fundamental pointer operations:

    1. Assignment: sets a pointer variable to the address of some object.
    2. Dereferencing: allows a pointer to be followed to the data object to which it points.

    2 problems that can be encountered when performing pointer operations:

    1. Dangling pointers (dangling references)
    2. Garbage (lost objects)
    - In most languages, pointers are used in heap management.

    2 types of heap elements:
    1. Fixed-size allocation heap
    - All heap storage is allocated and deallocated in units of a single size.
    - All cells are linked together using the pointers in the cells, forming the free space list.
    - Allocation depends on the next available space.
    - A dynamic variable can be pointed to by more than one pointer, making it impossible to determine when the variable is no longer useful to the program.
    - Creation of a collection of cells that are no longer accessible and should be deallocated is also possible.

    Ex.
      var p, q, r : ^integer; i : integer;
      begin
        new(p); p^ := 4;
        new(q); q^ := 5;
        new(p); p^ := 3;     { the cell holding 4 is now garbage }
        q := p;              { the cell holding 5 is now garbage }
        dispose(p);          { q is now a dangling pointer }
        new(r); r^ := 5;     { r may reuse the cell that q still points to }
        q^ := 0;             { may silently overwrite r^ }
        new(p); p^ := 5;
        i := p^ div r^;      { may divide by zero }
      end;

    Solutions to the Dangling Pointer Problem

    1. Use of Tombstones
    - The actual pointer variable points only to tombstones and never to dynamic variables.
    - When a dynamic variable is deallocated, the tombstone remains but is set to nil, indicating that the dynamic variable no longer exists.

    [Figure: a pointer refers to a tombstone, which in turn refers to the dynamic variable]


    2. Locks-and-Keys
    - Pointer values are represented as ordered pairs (key, address), where the key is an integer value.
    - Dynamic variables are represented as storage for the variable plus a header cell that stores an integer lock value.
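The locks-and-keys scheme can be simulated with a toy heap. In this Python sketch, `Heap` and its methods are hypothetical names, a dictionary stands in for memory, and 0 is assumed to be the illegal lock value written on deallocation. Every access compares the key carried by the pointer with the lock stored in the header cell, so a dangling pointer is detected instead of silently used.

```python
class Heap:
    """Toy heap that detects dangling pointers with locks and keys."""

    def __init__(self):
        self.cells = {}          # address -> [lock, value]
        self.next_key = 1
        self.next_address = 0

    def new(self, value):
        address, key = self.next_address, self.next_key
        self.next_address += 1
        self.next_key += 1
        self.cells[address] = [key, value]   # header cell stores the lock
        return (key, address)                # a pointer is a (key, address) pair

    def _check(self, pointer):
        key, address = pointer
        if self.cells[address][0] != key:
            raise RuntimeError("dangling pointer detected")

    def deref(self, pointer):
        self._check(pointer)
        return self.cells[pointer[1]][1]

    def dispose(self, pointer):
        self._check(pointer)
        self.cells[pointer[1]][0] = 0        # illegal lock: no key matches it

heap = Heap()
p = heap.new(3)
q = p                 # q is an alias of p (same key, same address)
heap.dispose(p)
try:
    heap.deref(q)
except RuntimeError as error:
    print(error)
```

Copies of a pointer copy the key as well, so any number of aliases are invalidated at once when the lock is cleared.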

    Ways to reclaim garbage

    1. Reference Counters (eager approach)
    - Reclamation is incremental and is done when inaccessible cells are created.

    2. Garbage Collection (lazy approach)
    - Reclamation only occurs when the list of available space becomes empty.
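The eager approach can be sketched with an explicit counter per cell. The names `Cell` and `RefCountedHeap` below are hypothetical, and the sketch ignores cycles, which plain reference counting cannot reclaim. A cell is returned to the free list at the moment its last reference is released.

```python
class Cell:
    def __init__(self, value):
        self.value = value
        self.ref_count = 0       # number of pointers currently referring to the cell

class RefCountedHeap:
    def __init__(self):
        self.free_list = []      # reclaimed cells, available for reuse

    def acquire(self, cell):
        """Record that one more pointer refers to the cell."""
        cell.ref_count += 1
        return cell

    def release(self, cell):
        """Record that a pointer was reassigned or went out of scope."""
        cell.ref_count -= 1
        if cell.ref_count == 0:  # the cell just became inaccessible
            self.free_list.append(cell)

heap = RefCountedHeap()
cell = Cell(4)
p = heap.acquire(cell)
q = heap.acquire(cell)
heap.release(p)
print(cell.ref_count, len(heap.free_list))   # still referenced through q
heap.release(q)
print(cell.ref_count, len(heap.free_list))   # reclaimed immediately
```

A lazy garbage collector would instead leave such cells alone until the free list is exhausted and only then search for the inaccessible ones.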

    Abstract Data Types
    Data Abstraction

    An Abstract Data Type is defined as:

    1. A set of data objects, ordinarily using one or more type definitions,
    2. A set of abstract operations on those data objects, and
    3. Encapsulation of the whole in such a way that the user of the new type cannot manipulate data objects of the type except by use of the operations defined.

    Basic Terminologies:

    Information Hiding: clients cannot change the underlying representation of objects directly.

    Type Definition: defines the structure of a data object with its possible value bindings.

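    Example:

      #include <cstdio>

      const int n = 100;          // stack capacity; n is undefined in the source, 100 is an assumed value

      class my_stack {
      private:
          int top, element[n];
      public:
          my_stack();
          void pop(int *item);
          void push(int item);
          void s_top();
          int s_empty();
      };

      my_stack::my_stack() { top = 0; }

      void my_stack::pop(int *item) {
          if (top == 0) printf("Stack Empty!");
          else {
              top--;              // top counts the stored items, so the last one is at top-1
              *item = element[top];
          }
      }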


      void my_stack::push(int item) {
          if (top == n) printf("Stack Full!");     // all n slots are in use
          else {
              element[top] = item;
              top++;
          }
      }

      int my_stack::s_empty() {
          if (top == 0) return 1;
          else return 0;
      }

      void my_stack::s_top() {
          if (top == 0) printf("Stack Empty");
          else printf("%d\n", element[top-1]);
      }

    Notes on the example:
    - class my_stack: the name of the abstract data type. Creating an object of type my_stack is similar to declaring an ordinary variable:
          my_stack S1, S2;
    - pop, push, s_top, and s_empty: the methods or functions declared inside my_stack. These methods can only be invoked on my_stack objects, which makes the type encapsulated.


    SYNTAX AND SEMANTICS

    4.1 Introduction

    Programming language implementers must be able to determine how the expressions, statements, and program units of a language are formed, and also their intended effect when executed.

    The syntax of a programming language is the form of its expressions, statements, and program units.
    Its semantics is the meaning of those expressions, statements, and program units.

    4.2 Syntax

    A language is a set of strings of characters from some alphabet.

    The strings of a language are called sentences or statements.

    The syntax rules of a language specify which strings of characters from the language's alphabet are in the language.

    A lexeme is the lowest-level syntactic unit of a language.

    A program is a string of lexemes.

    A token of a language is a category of its lexemes.

    Example: C statement:

      index = 2 * count + 17;

      Lexemes    Tokens
      index      identifier
      =          equal_sign
      2          int_constant
      *          mult_op
      count      identifier
      +          plus_op
      17         int_constant
      ;          semicolon
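The pairing of lexemes with tokens is what a lexical analyzer produces. Below is a minimal Python sketch that reproduces the table for the example statement; the token names follow the table, while `TOKEN_SPEC`, `MASTER`, and `tokenize` are hypothetical names and the patterns cover only the lexemes that appear in this example.

```python
import re

# One (token name, pattern) pair per token category in the table.
TOKEN_SPEC = [
    ("identifier",   r"[A-Za-z_]\w*"),
    ("int_constant", r"\d+"),
    ("equal_sign",   r"="),
    ("mult_op",      r"\*"),
    ("plus_op",      r"\+"),
    ("semicolon",    r";"),
    ("skip",         r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return the (lexeme, token) pairs of the text, left to right."""
    pairs = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        if m.lastgroup != "skip":            # white space is not a lexeme
            pairs.append((m.group(), m.lastgroup))
        pos = m.end()
    return pairs

for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<8}{token}")
```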

    4.3 Formal Methods of Describing Syntax

    BACKUS-NAUR FORM (BNF)
    - Used to specify programming language syntax
    - A metalanguage (a language that is used to describe another language)

    Four components:
    - set of production rules or grammar
    - set of nonterminal symbols
    - set of terminal symbols
    - start symbol


    LHS → RHS

    Example:

      <assign> → <var> = <expression>

    - The symbol on the left side of the arrow, which is aptly called the left-hand side (LHS), is the abstraction being defined.
    - The text to the right of the arrow is the definition of the LHS. It is called the right-hand side (RHS) and consists of some mixture of tokens, lexemes, and references to other abstractions.
    - Altogether, the definition is called a rule, or production.
    - The abstractions in a BNF description, or grammar, are often called nonterminal symbols, or simply nonterminals.
    - The lexemes and tokens of the rules are called terminal symbols, or simply terminals.
    - A grammar is simply a collection of rules.
    - Nonterminal symbols can have two or more distinct definitions, representing two or more possible syntactic forms in the language, separated by the symbol |, meaning logical OR.

    Example: PASCAL if

      <if_stmt> → if <logic_expr> then <stmt>
      <if_stmt> → if <logic_expr> then <stmt> else <stmt>

    or with the rule

      <if_stmt> → if <logic_expr> then <stmt>
                | if <logic_expr> then <stmt> else <stmt>

    Given the following statements, write the corresponding BNF.

    1. var a: integer;

         <var_decl> → var <id> : <type> ;
         <id> → a | b | c
         <type> → integer | char | string | real

    2. int a;

         <var_decl> → <type> <id> ;
         <type> → int | float | char

    Accepted vs. Rejected Sentences
    - A string or sentence is accepted if it belongs to the language, that is, if it can be derived from the start symbol using the production rules.
    - 2 ways to check if a string is accepted:
        1. DERIVATION
        2. PARSE TREES

    Derivation
    - It is the process of generating the valid or accepted sentences of a language by applying a sequence of the production rules, beginning with the start symbol.
    - 2 types:
        1. Leftmost derivation: always replace the leftmost nonterminal
        2. Rightmost derivation: always replace the rightmost nonterminal


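    - A string is accepted if at the end of the derivation only terminal symbols are left; otherwise, the string is rejected.

    - Example: assignment

        <program> → begin <stmt_list> end
        <stmt_list> → <stmt>
                    | <stmt> ; <stmt_list>
        <stmt> → <var> := <expression>
        <var> → A | B | C
        <expression> → <var> + <var>
                     | <var> - <var>
                     | <var>

    - A derivation of a program in this language follows:

        <program> => begin <stmt_list> end
                  => begin <stmt> ; <stmt_list> end
                  => begin <var> := <expression> ; <stmt_list> end
                  => begin A := <expression> ; <stmt_list> end
                  => begin A := <var> + <var> ; <stmt_list> end
                  => begin A := B + <var> ; <stmt_list> end
                  => begin A := B + C ; <stmt_list> end
                  => begin A := B + C ; <stmt> end
                  => begin A := B + C ; <var> := <expression> end
                  => begin A := B + C ; B := <expression> end
                  => begin A := B + C ; B := <var> end
                  => begin A := B + C ; B := C end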

    - Another example: Simple assignment statements

        <assign> → <id> := <expr>
        <id> → A | B | C
        <expr> → <id> + <expr>
               | <id> * <expr>
               | ( <expr> )
               | <id>

      The statement

        A := B * (A + C)

      is generated by the leftmost derivation:

        <assign> => <id> := <expr>
                 => A := <expr>
                 => A := <id> * <expr>
                 => A := B * <expr>
                 => A := B * ( <expr> )
                 => A := B * ( <id> + <expr> )
                 => A := B * ( A + <expr> )
                 => A := B * ( A + <id> )
                 => A := B * ( A + C )

      Therefore, the statement is accepted.

    Will the statement B := A * (A * B + C) be accepted or rejected?
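Questions of this kind can be checked mechanically with a small recognizer. The Python sketch below assumes the simple assignment grammar is read as: an assignment is an identifier, `:=`, and an expression; an identifier is A, B, or C; and an expression is either a parenthesized expression or an identifier optionally followed by `+` or `*` and another expression. The function name `accepts` is hypothetical. It returns True exactly when a leftmost derivation of the sentence exists.

```python
import re

def accepts(sentence):
    """Return True if the sentence can be derived from the assignment grammar."""
    stripped = "".join(sentence.split())
    tokens = re.findall(r":=|[ABC+*()]", stripped)
    if "".join(tokens) != stripped:
        return False                       # contains a symbol that is not a terminal
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expr():
        nonlocal pos
        if peek() == "(":                  # expression -> ( expression )
            pos += 1
            if not expr() or peek() != ")":
                return False
            pos += 1
            return True
        if peek() in ("A", "B", "C"):      # expression -> identifier ...
            pos += 1
            if peek() in ("+", "*"):       # ... followed by + or * and an expression
                pos += 1
                return expr()
            return True
        return False

    # assignment -> identifier := expression
    if peek() not in ("A", "B", "C"):
        return False
    pos += 1
    if peek() != ":=":
        return False
    pos += 1
    return expr() and pos == len(tokens)

print(accepts("A := B * (A + C)"))
print(accepts("B := A * (A * B + C)"))
print(accepts("A := (A + B) * C"))
```

Under this reading of the grammar, a parenthesized expression cannot be followed by an operator, which is why the last sentence is treated differently from the first two.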


    PARSE TREE

    - A hierarchical syntactic structure. It pictorially shows how the start symbol of a grammar derives a string in the language.
        Root node: start symbol
        Interior nodes: nonterminal symbols
        Leaf nodes: tokens or terminal symbols

    Note:
    A grammar that generates a sentence for which there are two or more distinct parse trees is said to be ambiguous; such a sentence has more than one meaning, which may cause misinterpretation.

    - Example: assignment

        <program> → begin <stmt_list> end
        <stmt_list> → <stmt>
                    | <stmt> ; <stmt_list>
        <stmt> → <var> := <expression>
        <var> → A | B | C
        <expression> → <var> + <var>
                     | <var> - <var>
                     | <var>

    Parse Tree Representation: begin A := B + C; B := C end

        <program>
          begin
          <stmt_list>
            <stmt>
              <var>: A
              :=
              <expression>
                <var>: B
                +
                <var>: C
            ;
            <stmt_list>
              <stmt>
                <var>: B
                :=
                <expression>
                  <var>: C
          end

    Exercise:

        S → aAS
          | a
        A → SbA
          | SS
          | ba


    Determine whether each of the following statements is ACCEPTED or REJECTED:

    1. aabbaa
    2. abba
    3. abaabaa

    EXTENDED BNF (EBNF)
    - Increases readability and writability

    Meta symbols or notations used:

      |       Multiple choice (used to represent alternative definitions)
      [ ]     Optional part
      { }*    Repeated zero or more times
      { }+    Repeated one or more times

    Example:

      BNF:  <expr> → <expr> + <term>
                   | <expr> - <term>
                   | <term>
            <term> → <term> * <factor>
                   | <term> / <factor>
                   | <factor>

      EBNF: <expr> → <term> {(+ | -) <term>}*
            <term> → <factor> {(* | /) <factor>}*
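One practical benefit of EBNF is that each repetition maps directly onto a loop in a hand-written parser. The Python sketch below follows that idea for an expression grammar with additive and multiplicative operators; it assumes that a factor is an unsigned integer literal, which the example above does not define, and the function name `evaluate` is hypothetical.

```python
import re

def evaluate(text):
    """Parse and evaluate an expression using one loop per EBNF repetition."""
    stripped = "".join(text.split())
    tokens = re.findall(r"\d+|[-+*/]", stripped)
    if "".join(tokens) != stripped:
        raise SyntaxError("unexpected character")
    pos = 0

    def factor():                      # assumed: a factor is an integer literal
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("expected a number")
        pos += 1
        return int(tokens[pos - 1])

    def term():                        # term: factor, then zero or more (* or /) factor
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def expr():                        # expr: term, then zero or more (+ or -) term
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            value = value + right if op == "+" else value - right
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return result

print(evaluate("2 + 3 * 4 - 6 / 2"))
```

The left-recursive BNF rules would cause a naive recursive-descent parser to recurse forever, which is one reason the iterative EBNF form is convenient for implementation.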

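    Given the C variable declarations, write the corresponding EBNF.

      int A;
      int A, B;
      char C, D, E;

    Given the if-then-else statements in Ada, write the corresponding EBNF.

      1. if <condition> then <statement>;

      2. if <condition> then <statement>;
         else <statement>;

      3. if <condition> then <statement>;
         elsif <condition> then <statement>;
         else <statement>;

      4. if <condition> then <statement>;
         elsif <condition> then <statement>;
         elsif <condition> then <statement>;
         else <statement>;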