Specifying Languages
CS 480/680 – Comparative Languages
Language Specification 2
Specifying a Language
Informal methods
• Textbooks, tutorials, etc.
Formal definitions
• Needed for exactness
• Compiler writers, etc.
• Like technical specifications for design
Syntax – what expressions are legal?
Semantics – what should they do?
Language Specification 3
Context Free Grammars
Definition: A context-free grammar (CFG) is a 4-tuple, G = (V, Σ, R, S)
• V = variables (non-terminal symbols)
• Σ = terminal symbols (alphabet)
• R = production rules
• S = start symbol, S ∈ V
V, Σ, and R are all finite sets; S is a single symbol.
Language Specification 4
A Context Free Grammar
V = {A, B}
Σ = {a, b}
R = A → aAa
    A → B
    B → bBb
    B → ε
S = A

Example derivation (rule applied at each step in parentheses):
A ⇒ aAa          (A → aAa)
  ⇒ aaAaa        (A → aAa)
  ⇒ aaBaa        (A → B)
  ⇒ aabBbaa      (B → bBb)
  ⇒ aabbBbbaa    (B → bBb)
  ⇒ aabbbbaa     (B → ε)
What language does this grammar specify?
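One way to explore the question is to enumerate the short strings the grammar produces. The following is a minimal sketch, assuming the rules are A → aAa | B and B → bBb | ε (the ε is an assumption, since the symbol is missing from the slide text). The names `RULES`, `START`, and `generate` are made up for this illustration.

```python
from collections import deque

# Grammar from the slide, ASSUMING the rules A -> aAa | B and B -> bBb | ε.
# Upper-case keys are non-terminals; the empty string "" stands for ε.
RULES = {
    "A": ["aAa", "B"],
    "B": ["bBb", ""],
}
START = "A"


def generate(rules, start, max_len):
    """Return every terminal string of length <= max_len derivable from start.

    Breadth-first search over sentential forms, always expanding the
    leftmost non-terminal. This terminates for the grammar above; it is
    not a general-purpose CFG enumerator.
    """
    results = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, if any.
        pos = next((i for i, ch in enumerate(form) if ch in rules), None)
        if pos is None:
            results.add(form)  # only terminals left: a string of the language
            continue
        for rhs in rules[form[pos]]:
            new_form = form[:pos] + rhs + form[pos + 1:]
            # Terminals never disappear, so too many of them is a dead end.
            terminals = sum(1 for ch in new_form if ch not in rules)
            if terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(results, key=lambda s: (len(s), s))


if __name__ == "__main__":
    print(generate(RULES, START, 4))
```

Looking at the generated strings for increasing `max_len` suggests the pattern: some number of a's, an even number of b's, then the same number of a's again.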
Language Specification 5
Another Example CFG
V = {A}
Σ = {a, b}
R = A → aAa
    A → bAb
    A → a
    A → b
S = A
What language does this grammar specify?
Language Specification 6
More examples
Write a CFG for the following languages:
• “All strings consisting of one or more a’s, followed by twice as many b’s.”
• “Strings with more a’s than b’s.”
There is an entire class devoted to formal specifications of languages: CS 466/666 – Introduction to Formal Languages
Language Specification 7
A CFG for Integer Arithmetic Expressions
V = {<num>, <digit>, <op>, <expr>}
Σ = {(, ), 0…9, +, –, ×, ÷}
R = <expr> → <num> | <expr> <op> <expr> | (<expr>)
    <num> → <digit><num> | <digit>
    <digit> → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    <op> → + | – | × | ÷
S = <expr>
Language Specification 8
Derivation of an Expression
<expr> ⇒ <expr> <op> <expr>
       ⇒ (<expr>) <op> <expr>
       ⇒ (<expr>) + <expr>
       ⇒ (<expr> <op> <expr>) + <expr>
       ⇒ (<expr> × <expr>) + <expr>
       ⇒ (<num> × <expr>) + <expr>
       ⇒ (<digit><num> × <expr>) + <expr>
       ⇒ (<digit><digit> × <num>) + <expr>
       ⇒ (7<digit> × <num>) + <expr>
       ⇒ (73 × <num>) + <expr>
       ⇒ (73 × <digit>) + <expr>
       ⇒ (73 × 4) + <expr>
       ⇒ (73 × 4) + <num>
       ⇒ (73 × 4) + <digit>
       ⇒ (73 × 4) + 9
Language Specification 9
Parse Trees
The derivation of an expression can also be expressed as a tree
This parse tree can help to resolve the interpretation of an expression
A compiler reads in the source code and produces a parse tree before generating code.
Language Specification 10
Example Parse Tree
A simple CFG: E → E – E | 0 | 1
E ⇒ E – E ⇒ E – E – E ⇒ 1 – E – E ⇒ 1 – 0 – E ⇒ 1 – 0 – 1
[Figure: two parse trees for 1 – 0 – 1, one grouping the expression as (1 – 0) – 1 and the other as 1 – (0 – 1)]
Since there are two parse trees for this expression, the grammar is ambiguous. (Note: the order of substitution is not the issue.)
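The two readings can be checked mechanically. The sketch below enumerates every parse tree of a token list under the grammar E → E – E | 0 | 1 and evaluates each one. The function names are illustrative only, and ASCII `-` stands in for the minus sign on the slide.

```python
def parse_trees(tokens):
    """Return all parse trees for tokens under E -> E - E | 0 | 1.

    A tree is either a digit string ("0" or "1") or a tuple
    (left_tree, "-", right_tree).
    """
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0] in ("0", "1") else []
    trees = []
    # Try every "-" as the top-level operator of the rule E -> E - E.
    for i, tok in enumerate(tokens):
        if tok == "-":
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, "-", right))
    return trees


def evaluate(tree):
    """Evaluate a parse tree as integer subtraction."""
    if isinstance(tree, str):
        return int(tree)
    left, _, right = tree
    return evaluate(left) - evaluate(right)


def show(tree):
    """Render a tree with explicit parentheses."""
    if isinstance(tree, str):
        return tree
    left, _, right = tree
    return f"({show(left)} - {show(right)})"


if __name__ == "__main__":
    for tree in parse_trees(["1", "-", "0", "-", "1"]):
        print(show(tree), "=", evaluate(tree))
```

The two trees evaluate to different values (0 and 2), which is why the ambiguity matters for semantics and not only for syntax.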
Language Specification 11
Ambiguity
If some expression has two or more parse trees, the grammar is syntactically ambiguous
Programming languages should be specified by unambiguous grammars
• Otherwise it is difficult to determine the semantics of a syntactically correct statement
• a = b + c * d;
• Conventions (like operator precedence) can be used to clarify syntactically ambiguous grammars
Language Specification 12
Disambiguating a grammar
We can disambiguate our simple grammar by adding explicit parentheses: E → (E – E) | 0 | 1
E ⇒ (E – E) ⇒ ((E – E) – E) ⇒ ((1 – E) – E) ⇒ ((1 – 0) – E) ⇒ ((1 – 0) – 1)
Often, you can remove ambiguity in a grammar by imposing state (a fixed order of stages) on the derivation.
Language Specification 13
An ambiguous grammar
S → aSb | aSbb | ε
Language: L = {a^n b^m | 0 ≤ n ≤ m ≤ 2n}
• The number of b’s is between the number of a’s and twice the number of a’s
aabbb can be generated two ways:
• S ⇒ aSb ⇒ aaSbbb ⇒ aabbb
• S ⇒ aSbb ⇒ aaSbbb ⇒ aabbb
Disambiguating:
• Step 1: Produce all a’s with matching b’s
• Step 2: Produce all extra b’s.
S → aSb | A | ε
A → aAbb | abb
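To check both claims, one can count parse trees directly. The sketch below assumes the ambiguous grammar is S → aSb | aSbb | ε and the disambiguated one is S → aSb | A | ε with A → aAbb | abb (the ε alternatives are an assumption, since the symbol is missing from the slide text). All function names are illustrative.

```python
def count_ambiguous(w):
    """Number of parse trees for w under S -> aSb | aSbb | ε."""
    if w == "":
        return 1  # S -> ε
    count = 0
    if w.startswith("a") and w.endswith("b"):
        count += count_ambiguous(w[1:-1])  # S -> aSb
    if w.startswith("a") and w.endswith("bb"):
        count += count_ambiguous(w[1:-2])  # S -> aSbb
    return count


def count_a(w):
    """Number of parse trees for w under A -> aAbb | abb."""
    count = 1 if w == "abb" else 0  # A -> abb
    if w.startswith("a") and w.endswith("bb"):
        count += count_a(w[1:-2])  # A -> aAbb
    return count


def count_unambiguous(w):
    """Number of parse trees for w under S -> aSb | A | ε."""
    count = 1 if w == "" else 0  # S -> ε
    if w.startswith("a") and w.endswith("b"):
        count += count_unambiguous(w[1:-1])  # S -> aSb
    return count + count_a(w)  # S -> A


def in_language(w):
    """Direct test for membership in {a^n b^m | 0 <= n <= m <= 2n}."""
    n = len(w) - len(w.lstrip("a"))
    m = len(w) - n
    return w == "a" * n + "b" * m and n <= m <= 2 * n


if __name__ == "__main__":
    print(count_ambiguous("aabbb"), count_unambiguous("aabbb"))
```

For `aabbb` the first grammar yields two parse trees and the second yields one, while both accept exactly the strings for which `in_language` is true (at least for all short strings one cares to test).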
Language Specification 14
BNF
Backus-Naur Form
A standard notation for CFGs, often used in specifying languages
• Non-terminals (variables) are enclosed in <>: <expression>, <number>
• <empty> = ε
• ::= is the production symbol (→)
• | is used for “or”
Language Specification 15
BNF Example
<real-number> ::= <integer-part> . <fraction>
<integer-part> ::= <digit> | <integer-part> <digit>
<fraction> ::= <digit> | <digit> <fraction>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Can we generate the number “.7” from this grammar?
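The question can be tested with a small recognizer that mirrors the rules one function per non-terminal. This is only a sketch: it assumes <digit> means the decimal digits 0 to 9, and the function names are made up for the example.

```python
def is_digit(ch):
    # <digit>: ASSUMED to be one of the decimal digits 0..9
    return len(ch) == 1 and ch in "0123456789"


def is_integer_part(s):
    # <integer-part> ::= <digit> | <integer-part> <digit>
    if len(s) == 1:
        return is_digit(s)
    return len(s) > 1 and is_integer_part(s[:-1]) and is_digit(s[-1])


def is_fraction(s):
    # <fraction> ::= <digit> | <digit> <fraction>
    if len(s) == 1:
        return is_digit(s)
    return len(s) > 1 and is_digit(s[0]) and is_fraction(s[1:])


def is_real_number(s):
    # <real-number> ::= <integer-part> . <fraction>
    # Digits never contain ".", so splitting at the first "." is enough.
    head, dot, tail = s.partition(".")
    return dot == "." and is_integer_part(head) and is_fraction(tail)


if __name__ == "__main__":
    for text in ["3.14", "0.7", ".7", "7."]:
        print(repr(text), is_real_number(text))
```

Both <integer-part> and <fraction> need at least one digit, so the recognizer rejects ".7" and "7." while accepting "0.7".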
Language Specification 16
Extended BNF
Makes some constructs easier to specify
No more powerful than BNF
Rules:
• { } = “zero or more”
• [ ] = “optional” or, equivalently, “zero or one”
• | = “or”
• ( ) are used for grouping
Language Specification 17
Arithmetic Expressions
In BNF:
<expression> ::= <expression> + <term>
               | <expression> – <term>
               | <term>
<term> ::= <term> * <factor>
         | <term> / <factor>
         | <factor>
<factor> ::= number | name | (<expression>)
In EBNF:
<expression> ::= <term> { (+ | –) <term> }
<term> ::= <factor> { (* | /) <factor> }
<factor> ::= ‘(’ <expression> ‘)’ | number | name
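The EBNF form maps almost directly onto a recursive-descent parser: each rule becomes a function and each { ... } becomes a loop. The following is a minimal sketch of that idea, not a definitive implementation. It assumes numbers are unsigned integers, uses ASCII `-` for the minus sign, evaluates `/` as Python true division, and looks names up in a caller-supplied dictionary. `tokenize`, `Parser`, and `evaluate` are names invented for the example.

```python
import re

# One token per match: an integer, a name, or any other single character.
TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(text):
    tokens = []
    for number, name, symbol in TOKEN.findall(text):
        if number:
            tokens.append(("number", int(number)))
        elif name:
            tokens.append(("name", name))
        elif symbol.strip():  # skip stray whitespace
            tokens.append((symbol, symbol))
    return tokens


class Parser:
    def __init__(self, tokens, env=None):
        self.tokens = tokens
        self.pos = 0
        self.env = env or {}  # values for names (an assumption of this sketch)

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expression(self):
        # <expression> ::= <term> { (+ | -) <term> }
        value = self.term()
        while self.peek() in ("+", "-"):
            op, _ = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # <term> ::= <factor> { (* | /) <factor> }
        value = self.factor()
        while self.peek() in ("*", "/"):
            op, _ = self.advance()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # <factor> ::= '(' <expression> ')' | number | name
        kind = self.peek()
        if kind == "(":
            self.advance()
            value = self.expression()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.advance()
            return value
        if kind == "number":
            return self.advance()[1]
        if kind == "name":
            return self.env[self.advance()[1]]
        raise SyntaxError(f"unexpected token: {kind!r}")


def evaluate(text, env=None):
    parser = Parser(tokenize(text), env)
    value = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError("unexpected trailing input")
    return value


if __name__ == "__main__":
    print(evaluate("1 - 0 - 1"))
    print(evaluate("b + c * d", {"b": 1, "c": 2, "d": 3}))
```

Because the loops accumulate from the left, `1 - 0 - 1` is read as `(1 - 0) - 1`, and because `term` sits below `expression`, `*` and `/` bind tighter than `+` and `-`. The grammar's layering encodes both associativity and precedence.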