Lecture 4. Parsing (syntax analysis)
Wei Le
2015.9
Outline
I review: languages and grammars
I introduction of parsing
I CFG: context free grammars
I derivation
I ambiguity
Languages
I a theoretical concept
I a set of strings: the set can be infinite
I Chomsky Hierarchy (1956): regular (type 3), context-free (type 2), context-sensitive (type 1), recursively enumerable (type 0)
I Computational complexity: the expressive power of (and the resources needed in order to process) classes of languages
I The more restricted the rules are, the lower in the hierarchy the languages they generate are
Grammars
I A specification of languages: syntax
I Grammars are used for compression (biology): learn a grammar from a set of strings
I generators (produce strings) or recognizers (match production rules)
I Compiler: recognizers
I A grammar G is a 4-tuple:
  I Σ: a set of terminals
  I V: a set of non-terminals
  I S: start symbol
  I a set of production rules
I L(G ) represents the language of the grammar
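The 4-tuple can be written down directly as a data structure. The following is a minimal sketch, not a standard API: the class name Grammar, its field names, and the is_well_formed check are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # Σ
    nonterminals: frozenset   # V
    start: str                # S
    productions: tuple        # pairs (left-hand side, right-hand side), each a tuple of symbols

    def is_well_formed(self) -> bool:
        """Check that the four components fit together."""
        symbols = self.terminals | self.nonterminals
        return (
            self.start in self.nonterminals
            and self.terminals.isdisjoint(self.nonterminals)
            and all(s in symbols for lhs, rhs in self.productions for s in lhs + rhs)
        )


# Hypothetical example grammar: S -> a S b | ε (the empty tuple stands for ε)
G = Grammar(
    terminals=frozenset({"a", "b"}),
    nonterminals=frozenset({"S"}),
    start="S",
    productions=((("S",), ("a", "S", "b")), (("S",), ())),
)
```

Here G.is_well_formed() holds because the start symbol is a non-terminal and every symbol in a production belongs to Σ or V.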
What Can Regular Languages Express?
I Finite number of characters
I Infinite, allow repeated patterns. Intuition: a finite automaton that runs long enough must repeat states
I Strings whose patterns are related to a fixed number of states
I A finite automaton can't remember the number of times it has visited a particular state
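A small finite automaton makes the last point concrete. This sketch recognizes a*b* with three states; the state numbering and the names TRANSITIONS, ACCEPTING and accepts_a_star_b_star are assumptions chosen for illustration.

```python
# States: 0 = still reading a's, 1 = reading b's, 2 = dead state.
TRANSITIONS = {
    (0, "a"): 0,
    (0, "b"): 1,
    (1, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 2,
    (2, "b"): 2,
}
ACCEPTING = {0, 1}


def accepts_a_star_b_star(text: str) -> bool:
    state = 0
    for ch in text:
        # Any symbol outside {a, b} sends the automaton to the dead state.
        state = TRANSITIONS.get((state, ch), 2)
    return state in ACCEPTING
```

The automaton accepts both "aabb" and "aaab": it only tracks which state it is in, so it cannot check that the number of a's equals the number of b's.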
Context Free Grammar
I a production does not depend on the context
I no terminals on the left side
I A can be replaced by α independent of the context where A is located
Examples
The set of strings is expressed using the following:
I {a*b*}
I {a^n b^n : n > 0} (e.g., nested parentheses such as "(())")
I {a^n b^n c^n : n > 0}
Examples
S → aSb
S → ε
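The two productions translate almost literally into a recursive recognizer. This is a minimal sketch; the function name derives_from_s is an assumption. Because of S → ε the grammar also accepts the empty string, so it describes a^n b^n for n ≥ 0.

```python
def derives_from_s(text: str) -> bool:
    """Return True if text can be derived from S with S -> a S b | ε."""
    # S -> ε
    if text == "":
        return True
    # S -> a S b: strip one 'a' on the left and one 'b' on the right, then recurse on S.
    if len(text) >= 2 and text[0] == "a" and text[-1] == "b":
        return derives_from_s(text[1:-1])
    return False
```

For example, derives_from_s("aabb") is True while derives_from_s("aab") is False.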
I In every derivation of a context-sensitive grammar, the length of the string never decreases
I The term "context-sensitive" comes from a normal form for these grammars, where each production is of the form α1Aα2 → α1βα2 with β ≠ ε
I They permit replacement of the variable A by the string β only in the "context" α1 _ α2
Chomsky Hierarchy: A Summary
Programming Languages and Natural Languages
I Programming languages:
  I C: all the programs written in C
  I modern programming languages: context free + some context-sensitive features
I Natural language: English is not regular, not context free (how to prove it?)
Parser
Lexer: scanner, lexical analyzer
Parser: syntax analyzer
Example
Role of Parsers
Parsing and CFG
I Recognize if a string (program) is in the language (yes, no)
I Generate a parse tree from the input
I Report/handle errors
I Form of the grammar is important
  I Many grammars generate the same language
  I Tools (e.g., Bison) are sensitive to the grammar
Context Free Grammars
BNF: Backus-Naur Form
I developed in the late 1950s and early 1960s by John Backus and Peter Naur
I a notation system for context free grammars
I <SheepNoise> ::= baa <SheepNoise> | baa, where ::= reads "derives"
I Today, many books use an updated form of BNF that writes → in place of ::=
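The SheepNoise rule can be used both as a generator and as a recognizer. The sketch below is illustrative only; the function names sheep_noise and is_sheep_noise, and the choice to separate tokens with spaces, are assumptions.

```python
def sheep_noise(extra: int) -> str:
    """Apply SheepNoise -> baa SheepNoise `extra` times, then finish with SheepNoise -> baa."""
    form = ["SheepNoise"]
    for _ in range(extra):
        form = form[:-1] + ["baa", "SheepNoise"]   # expand the trailing non-terminal
    form = form[:-1] + ["baa"]                     # end the derivation
    return " ".join(form)


def is_sheep_noise(text: str) -> bool:
    """Recognizer: the language is one or more baa tokens."""
    tokens = text.split()
    return len(tokens) >= 1 and all(token == "baa" for token in tokens)
```

Here sheep_noise(2) yields "baa baa baa", and is_sheep_noise rejects the empty string because the grammar has no ε production.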
Example CFGs
More Understandings for CFGs
Key Ideas
The Language of a CFG
Terminals
Examples
Derivations
I A parse tree has
  I Terminals at the leaves
  I Non-terminals at the interior nodes
I An in-order traversal of the leaves is the original input
I In-order traversal: left root right
I The parse tree shows the associativity of operators, the input string does not
I In programming languages, the associativity (or fixity) of an operator is a property that determines how operators of the same precedence are grouped in the absence of parentheses.
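To see that the tree carries information the input string does not, the sketch below builds two parse trees for 5 - 3 - 1 under an assumed grammar E → E - E | number. The Node class and the helper names leaves, evaluate and num are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                   # non-terminal inside, terminal at a leaf
    children: list = field(default_factory=list)


def leaves(node: Node) -> list:
    """Collect the terminals at the leaves from left to right."""
    if not node.children:
        return [node.label]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


def evaluate(node: Node) -> int:
    """Evaluate a tree built from E -> E - E | number."""
    if not node.children:
        return int(node.label)
    if len(node.children) == 1:
        return evaluate(node.children[0])
    left, _minus, right = node.children
    return evaluate(left) - evaluate(right)


def num(n: int) -> Node:
    return Node("E", [Node(str(n))])


# (5 - 3) - 1: the left operand is itself a subtraction.
left_assoc = Node("E", [Node("E", [num(5), Node("-"), num(3)]), Node("-"), num(1)])
# 5 - (3 - 1): the right operand is itself a subtraction.
right_assoc = Node("E", [num(5), Node("-"), Node("E", [num(3), Node("-"), num(1)])])
```

Both trees have the leaves 5 - 3 - 1, yet the left-associative tree evaluates to 1 and the right-associative tree to 3.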
Leftmost and Rightmost Derivations
Rightmost Derivations
Summary of Derivations
Revisiting Parsing
I A parser knows the grammar of the programming language
I The parser finds the derivation of a particular input
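As a small illustration of a parser finding a derivation, the sketch below hard-codes the grammar S → aSb | ε and returns the sequence of sentential forms for an input. The function name leftmost_derivation is an assumption, and a real parser would be driven by the grammar instead of being written by hand for one.

```python
def leftmost_derivation(text: str):
    """Return the sentential forms from S to text, or None if text is not in the language."""
    forms = ["S"]
    prefix, suffix = "", ""
    remaining = text
    while remaining:
        if len(remaining) >= 2 and remaining[0] == "a" and remaining[-1] == "b":
            prefix += "a"
            suffix = "b" + suffix
            remaining = remaining[1:-1]
            forms.append(prefix + "S" + suffix)   # applied S -> a S b
        else:
            return None                           # no production fits: syntax error
    forms.append(prefix + suffix)                 # applied S -> ε
    return forms
```

For the input "aabb" the result is ["S", "aSb", "aaSbb", "aabb"]. Since each sentential form contains a single non-terminal, this derivation is both leftmost and rightmost.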
Ambiguity
Definition: A context-free grammar G is ambiguous if some string has two or more distinct derivation trees
I A language can have many grammars; some of them may be ambiguous
I Some languages have only ambiguous grammars (inherent ambiguity)
I Ambiguity is bad for programming languages, and we want to remove it
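The definition can be checked mechanically for a small grammar by counting parse trees. The sketch below assumes the grammar E → E - E | n, which is chosen here for illustration and is not taken from the slides; the function name count_parse_trees is also an assumption.

```python
from functools import lru_cache


def count_parse_trees(tokens: tuple) -> int:
    """Count the distinct parse trees of tokens under E -> E - E | n."""

    @lru_cache(maxsize=None)
    def count(lo: int, hi: int) -> int:
        # Number of ways to derive tokens[lo:hi] from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "n" else 0
        total = 0
        # Try every '-' as the operator at the root of the tree.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] == "-":
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))
```

For the tokens n - n - n the count is 2, so this grammar is ambiguous: one tree groups the string as (n - n) - n and the other as n - (n - n).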
Ambiguity: Example
Dealing with Ambiguity
Associativity Declaration
Precedence Declaration
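One way to see how associativity and precedence declarations resolve ambiguity is a precedence-climbing parser driven by a declaration table. This is a sketch of that general technique, not the mechanism of any particular tool; the OPERATORS table, its precedence levels, and the function name parse are assumptions, and operands are assumed to be single tokens.

```python
# Hypothetical declaration table: operator -> (precedence, associativity).
OPERATORS = {
    "+": (1, "left"),
    "-": (1, "left"),
    "*": (2, "left"),
    "^": (3, "right"),
}


def parse(tokens: list, min_prec: int = 1):
    """Consume tokens from the front and return a nested (op, left, right) tuple."""
    left = tokens.pop(0)                          # an operand
    while tokens and tokens[0] in OPERATORS and OPERATORS[tokens[0]][0] >= min_prec:
        op = tokens.pop(0)
        prec, assoc = OPERATORS[op]
        # Left-associative: the right operand may only contain tighter-binding operators.
        # Right-associative: the right operand may contain the same operator again.
        next_min = prec + 1 if assoc == "left" else prec
        right = parse(tokens, next_min)
        left = (op, left, right)
    return left
```

With this table, parse(["1", "-", "2", "-", "3"]) groups to the left as ("-", ("-", "1", "2"), "3"), while parse(["1", "+", "2", "*", "3"]) places the multiplication below the addition.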