37
Course: ICS313 Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali [email protected] Lecturer, College of Computer Science and Engineering University of Hail

Course: ICS313 Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

  • Upload
    thanos

  • View
    43

  • Download
    0

Embed Size (px)

DESCRIPTION

Course: ICS313 Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali [email protected] Lecturer, College of Computer Science and Engineering University of Hail. Chapter 3: Describing Syntax and Semantics. Objectives:. - Introduction - PowerPoint PPT Presentation

Citation preview

Page 1: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Course: ICS313 Fundamentals of Programming Languages.

Instructor:

Abdul Wahid [email protected]

Lecturer, College of Computer Science and Engineering

University of Hail

Page 2: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Chapter 3: Describing Syntax and Semantics

- Introduction- The General Problem of Describing Syntax- Formal Methods of Describing Syntax- Attribute Grammars- Describing the Meanings of Programs: Dynamic Semantics

Objectives:

2

Page 3: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Who must use language definitions?-Language designers, - Implementers- Programmers (the users of the language)

- A formal description of the language is essential in learning, writing, and implementing the language. Thus description must be precise and understandable.

Introduction

A language is a set of sentences or statements. Sentences or statements are the valid strings of a language. It consists of valid alphabets sequenced in a way that is consistent with the grammar of the language.

3

Page 4: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

- Syntax describes what the language looks like. Semantics determines what a particular construct actually does in a formal way.- Syntax is much easier to describe than semantics.

What is the meaning of Semantics?The meaning of the expressions, statements, and program units.

What is the meaning of Syntax? Its means, the form or structure of the expressions, statements and program units. The Syntax rules of a language specify which strings of characters from the language’s alphabet are in the language.

- Languages are described by their syntaxes and semantics

4

Page 5: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Formal descriptions of the syntax of programming languages, for simplicity sake, often do not include descriptions of the lowest-level syntactic units. These small units called lexemes.

A lexeme is basic component during lexical analysis. A lexeme consists of related alphabet from the language. (e.g. numeric literals, operators, and special words).

Lexemes are partitioned into groups, such as, the names of variables, methods, classes, and so forth, each of these groups is represented by a name, of a token. A token is a category of related lexemes, and a lexeme is an instance of a token. (e.g., identifier)

3.2 The General Problem of Describing Syntax: : Terminology

5

Page 6: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Lexemes Tokens

Index Identifier

= Equal_sign

2 Int_literal

* Mult_op

17 Int_literal

; semicolon

example: consider the following Java statement

index = 2 * count +17;The lexemes and tokens of this statement are represented the coming table

In general, languages can be formally defined in two distinct ways: A language can be (1) generated or (2) recognized.

6

Page 7: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.2.1 Language RecognizerA recognizer of a language identifies those strings that are within a language from those that are not.The lexical analyzer and the parser of a compiler are the recognizer of the language the compiler translates. The lexical analyzer recognizes tokens, and the parser recognizes the syntactic structure.

3.2.2 Language GeneratorsA generator generates the valid sentences of a language.In some cases it is more useful than the recognizer since we can “watch and learn”.

7

Page 8: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3 Formal Methods of Describing Syntax

The formal language that is used to describe the syntax of

programming languages is called grammar.

3.3.1 BNF (Backus-Naur Form) and Context-Free GrammarsBNF is widely accepted way to describe the syntax of a programming language.

3.3.1.1 Context-Free Grammars Regular expression and context-free grammar are useful in describing the syntax of a programming language. Regular expression describes how a token is made of alphabets, and context-free grammar determines how the tokens are put together as a valid sentence in the grammar.

8

Page 9: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3 Formal Methods of Describing Syntax

3.3.1.2 Origins of Backus- Naur From (BNF)John Backus and Peter Naur used BNF to describe ALGOL 58 and ALGOL 60, and BNF is nearly identical to context-free grammar.

3.3.1.3 BNF Fundamentals• BNF is a metalanguage, i.e., a language that describes a language, which can describe a programming language.• The BNF consists of rules (or productions). A rule has a left-hand side (LHS) as the abstraction, and a right-hand side (RHS) as the definition. As in java assignment statement the defintion could be as follows:

<assign> → <var> = <expression>

9

Page 10: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

The LHS is a non-terminal and the RHS called terminal.

• The rule indicates that whenever you see a token on the LHS, you can replace it with the RHS, just like expanding a non-terminal into its children in a tree.

3.3.1.4 Describing Lists

Recursion is used in BNF to describe lists.

3.3.1.5 Grammars and Derivations

- BNF is a generator of the language. The sentences of a language can be generated from the start symbol by applying a series of rules on it. The process of starting from the start symbol to the final sentence is called a derivation.

10

Page 11: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

- Replacing a non-terminal with different RHS’s may derive different sentences.- Each sentence during a derivation is a sentential form. If we replace every leftmost non-terminal of the sentential form, the derivation is leftmost. However, the set of sentences generated are not affected by the derivation order.

11

Page 12: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Describing Syntax and Semantics

3.3.1.5 Grammars and Derivations

- BNF is a generator of the language. The sentences of a language can be generated from the start symbol by applying a series of rules on it. The process of starting from the start symbol to the final sentence is called a derivation.

Example (3.1) a grammar for Small language<program> → begin <stmt_list> end<stmt_list> → <stmt>

| <stmt>; <stmt_list> <stmt> → <var>= <expression><var> → A| B|C<expression> → <var + var>

| <var> - <var> | <var>

12

Page 13: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

A derivation of a program in this language follows:

- Replacing a non-terminal with different RHSs may derive different sentences.- Each sentence during a derivation is a sentential form. If we replace every leftmost non-terminal of the sentential form, the derivation is leftmost. However, the set of sentences generated are not affected by the derivation order.

<program> => begin <stmt_list> end => begin <stmt> ; <stmt_list> end => begin <var> = <expression>; <stmt_list> end => begin A= <expression>; <stmt_list> end => begin A = <var> + < var> ; <stmt_list> end

=> begin A = B+ < var> ; <stmt_list> end => begin A = B + C; <stmt> end => begin A = B + C ; <var> = <expression> end => begin A = B + C ; B = <expression> end => begin A = B + C ; B = C end

13

Page 14: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

A = B*(A+C) is generated by the leftmost derivation:

<assign> => <id> = <expr> => A = <expr> => A = <id> * <expr> => A = B * <expr> => A = B * (<expr>) => A = B * (<id> + <expr>) => A = B * (A + <expr>) => A = B * (A + <id>) => A = B * (A + C)

Example (3.2) a grammar for a Simple Assignment Statements <assign> → <id> = <expr><id> → A| B| C |<expr> → <id> + <expr>

| <id>*<expr> | (<expr>) | <id>

14

Page 15: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3.1.6 Parse Trees: A derivation can be represented by a tree hierarchy called parse tree. The root of the tree is the start symbol and applying a derivation rule corresponds to expand a non-terminal in a tree into its children.

<assign>

A parse tree for the simple statement: A = B * (A + C)

<id> = <expr>

<id> * <expr>

( <expr> )

<id> + <expr>

A

B

A <id>

C 15

Page 16: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3.1.7 Ambiguity: A grammar is ambiguous if for a given sentence, there is more than one parser tree, i.e., there are two derivations that lead to the same sentence.

<assign>Two distinct parse trees for the same sentence, A = B +A * C

<id> = <expr>

<expr> + <expr>

<expr> * <expr>

<id>

A

<id>

A

<id>

C

B

<assign>

<id> = <expr>

<expr> * <expr>

<expr> + <expr>

<id>

A

<id>

B

<id>

C

A

Formal Method of Describing Syntax (contd.)

16

Page 17: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3.1.8 Operator Precedence: Operator Precedence can be maintained by modifying a grammar so that operators with higher precedence are grouped earlier with its operands, so that they appear lower in the parse tree.

<expr> => <expr> + const | const (unambiguous)

<expr> => <expr> + <expr> | const (ambiguous)

<expr>

<expr> + <const>

<const>

<expr> + <const>

17

Associativity of OperatorsOperator associativity can also be indicated by a grammar

Page 18: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

- Optional parts are placed in brackets ([ ])

<proc_call> → ident [ ( <expr_list>)]

- Alternative parts of RHSs in parentheses and separate them with vertical bars

<term> → <term> (+ | -) const

- Put repetitions (0 or more) are placed inside braces ({})

<ident> → letter {letter | digit}

3.3.2 Extended EBNF

- EBNF has the same expression power as BNF

- The addition includes optional constructs (parts), repetition, and multiple choices, very much like regular expression.

18

Page 19: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3.2 Extended EBNF

BNF

<expr> <expr> + <term> | <expr> - <term> | <term>

<term> <term> * <factor> | <term> / <factor> | <factor>EBNF

<expr> <term> {(+ | -) <term>}

<term> <factor> {(* | /) <factor>}

19

Page 20: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3.4 Attribute Grammars

An attribute grammar is an extension to a context –free grammar . It is used to describe more of the structure of a programming language than can be described with context-free grammar. The extension allows certain language rules to be described, such as type compatibility.

3.3.4.1 Static Semantics- There are many restrictions on programming languages that are either difficult or impossible to describe in BNF, however, they can be described when we add attributes to the terminal/non-terminals in BNF.- These added attribute and their computation could be computed at compile-time, thus the name static semantics.

20

Page 21: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3.4 Attribute Grammars

3.3.4.1 Static Semantics (contd.)

- Consider a rule which states that a variable must be declared before it is referenced.

-- Cannot be specified in a context-free grammar.-- Can be tested at compile time.

- Some rules can be specified in the grammar of a language, but will unnecessarily complicate the grammar.

e.g. a rule in JAVA that states that a string literal cannot be assigned to a variable which was declared to be type int.

21

Page 22: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3.4 Attribute Grammars

3.3.4. 2 Basic concepts

An attribute grammar consists of, in addition to its BNF rules, the attributes of grammar symbols, a set of attribute computation functions (or semantic functions), and predicate functions that determine whether a rule could be applied. The latter two functions are associated with the grammar rules. Thus basic concepts are:

- Attribute grammars are grammars to which have been added attributes, attribute computation functions and predicate functions.

- Attributes associate with grammar symbols, are similar to variables in the sense that they can have values assigned to them.

22

Page 23: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

- Attribute computation functions semantic functions, are associated with grammar rules. They are used to specify how attribute values are computed.

- Predicate function which state some of the syntax and static semantic rules of the language are associated with grammar rule.

23

Page 24: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3.3.4. 3 Attribute grammars defined

Definition: An attribute grammar is a context-free grammar (CFG) with the following additions:- For each grammar symbol x there is a set A(x) of attribute

values.- Each rule has a set of functions that define certain attributes

of the non-terminals in the rule.- Each rule has a set of predicates to check for attribute

consistency.

24

-Associate with each grammar symbol X is a set of attributes A(X).

- A(X) consists of two disjoint sets S(X) and I(X), called synthesized and inherited attributes

Page 25: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Attribute Grammars (continued)

- Synthesized attributed are used to pass semantic information up a parse tree.

- Inherited attributes pass semantic information down a parse tree.

- Let X0 X1 ……………… Xn be a rule

- Function of the form I (Xn) = f (A(X0), …A (Xn)) define synthesized attributes.

- Functions of the form I(Xj) = f(A(X0), ... , A (Xn)), for i <= j <= n, define inherited attributes

25

Page 26: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Attribute Grammars (continued)

Example: expressions of the form id + idid's can be either int_type or real_typetypes of the two id's must be the sametype of the expression must match it's expected type

BNF<expr> <var> + <var><var> id

Attributes:Actual_type: A synthesized attribute associated with

the non-terminals for <var> and <expr>Exceted_type: An inherited attribute associated with

the non-terminal <expr>26

Page 27: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

An attribute grammar for simple assignment statements

1. Syntax rule: <assign> <var> = <expr>

Semantic rule: <expr>.expected_type <var>.actual_type

2. Syntax rule: <expr> <var>[2] + <var>[3]

Semantic rule: <expr>.actual_type

if (<var>[2].actual_type =int) and

(<var>[3].actual_type =int)

then int

else real

end if

Predicate: <expr>.actual_type = <expr>.expected_type

27

Page 28: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

An attribute grammar for simple assignment statements

3. Syntax rule: <expr> <var>

Semantic rule: <expr>.actual_type <var>.actual_type

Predicate: <expr>.actual_type = <expr>.expected_type

4. Syntax rule: <var> A | B | C

Semantic rule: <var>.actual_type look-up (<var>.string)

The look-up function looks up a given name in symbol table and returns the variable’s type.

28

Page 29: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Describing the meaning of Programs: Dynamic Semantics

- There is no single widely acceptable notation or formalism for describing semantics- The dynamic semantics of a program is the meaning of its expressions, statements, and program units. Dynamic semantic determines the meanings of programming constructs during their execution.

- Accurately describing semantics is essential so that...- ...users writing a program can precisely understand how the various language constructs work.- ...compilers will be implemented consistently with respect to one another.

- Three methods that are used to describe semantics formally:- Operational semantics- Axiomatic semantics-Denotational semantics

29

Page 30: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

1- Operational Semantics

- Describe the meaning of a program by executing its statements on a machine, either simulated or actual. The change in the state of the machine (memory, registers, etc.) defines the meaning of the statement

- The for loop in C++ or Java...

for (expr1; expr2; expr3) { stmt}

30

Page 31: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

2- Axiomatic Semantics

- Axiomatic semantics defined in conjunction with the development of a method to prove the correctness of a program.- Axiomatic semantics is based on mathematical logic. The logical expressions are called predicated, or assertions. - An assertion immediately following a statement describes a new constraints on those variables after execution of the statement.

31

- These assertions are called the precondition and post condition.- Developing an axiomatic description or proof of a given program requires that every statement in the program have both a precondition and a post condition.

Page 32: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

- Based on formal logic (first order predicate calculus)- Original purpose: formal program verification- Approach: Define axioms or inference rules for each statement type in the language (to allow transformations of expressions to other expressions)- The expressions are called assertions

An assertion before a statement (a precondition) states the relationships and constraints among variables that are true at that point in execution

- An assertion following a statement is a post-condition

32

Page 33: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

Example : Show that the program segmenty := 2z := x + y ,is correct with respect to the initial assertion {P} : x = 1 and the final assertion{Q}: z = 3.

Solution: Suppose that p is true, so that x=1 as the program begins.Then y is assigned 2 and z is assigned the sum of x and y, which is 3. Hence program segment is correct..

x y z1 2 1 + 2 = 3 , thus : {P} statement {Q} is true.

33

Page 34: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

3 – Denotational Semantics

- Based on recursive function theory- The most abstract semantics description method-Originally developed by Scott and Strachey (1970)

- The process of building a denotational specification for a language (not necessarily easy):

-Define a mathematical object for each language entity

-Define a function that maps instances of the language entities onto instances of the corresponding mathematical objects

- The meaning of language constructs are defined by only the values of the program's variables

- is the most rigorous, widely known method for describing the meaning of programs

34

Page 35: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

The difference between denotational and operational

semantics:

In operational semantics, the state changes are defined by

coded algorithms; in denotational semantics, they are defined

by rigorous mathematical functions

The state of a program is the values of all its current variables

s = {<i1, v1>, <i2, v2>, …, <in, vn>}

- Let VARMAP be a function that, when given a variable name

and a state, returns the current value of the variable

VARMAP(ij, s) = vj

35

Page 36: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

(e.g.) Decimal Numbers<dec_num> '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' | <dec_num> ('0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9')

Mdec('0') = 0, Mdec ('1') = 1, …, Mdec ('9') = 9

Mdec (<dec_num> '0') = 10 * Mdec (<dec_num>)

Mdec (<dec_num> '1’) = 10 * Mdec (<dec_num>) + 1…

Mdec (<dec_num> '9') = 10 * Mdec (<dec_num>) + 9

1-36

Page 37: Course:  ICS313  Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali

1-37

SummaryBNF and context-free grammars are equivalent meta-

languagesWell-suited for describing the syntax of programming

languagesAn attribute grammar is a descriptive formalism that

can describe both the syntax and the semantics of a language

Three primary methods of semantics descriptionOperation, axiomatic, denotational------------------------------------END------------------------------------------------