DESCRIPTION
This is a PowerPoint presentation about the lexical analyzer.
LEXICAL ANALYZER
By : Kiran Acharya
inspireray.blogspot.in

TOPICS COVERED
Introduction to lexical analyzer
Input buffering
Specifications of tokens
Regular expressions

REGULAR EXPRESSIONS
There are two basis rules in the definition of regular expressions.
1. ɛ is a regular expression, and L(ɛ) is {ɛ}, that is, the language whose sole member is the empty string.
2. If a is a symbol in Ʃ, then a is a regular expression, and L(a) = {a}, that is, the language with one string of length one, with a in its first position.

INDUCTION
(r)|(s) is a regular expression denoting L(r) ∪ L(s).
(r)(s) is a regular expression denoting L(r)L(s).
(r)* is a regular expression denoting (L(r))*.
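
The three inductive rules can be tried out on small finite languages. The sketch below is only an illustration: the function names are made up for this example, and because a true Kleene closure is infinite, `star` is cut off at an assumed maximum number of repetitions.

```python
from itertools import product

def union(lang_r, lang_s):
    """L(r) ∪ L(s): strings that belong to either language."""
    return lang_r | lang_s

def concat(lang_r, lang_s):
    """L(r)L(s): each string of the first language followed by each of the second."""
    return {x + y for x, y in product(lang_r, lang_s)}

def star(lang_r, max_repeats):
    """Finite approximation of (L(r))*: zero up to max_repeats concatenations."""
    result = {""}      # zero repetitions contribute the empty string
    current = {""}
    for _ in range(max_repeats):
        current = concat(current, lang_r)
        result |= current
    return result

L_a = {"a"}
L_b = {"b"}
print(sorted(union(L_a, L_b)))    # language of (a)|(b)
print(sorted(concat(L_a, L_b)))   # language of (a)(b)
print(sorted(star(L_a, 3)))       # first few strings of (a)*
```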

The unary operator * has the highest precedence and is left associative.
Concatenation has the second-highest precedence and is left associative.
| has the lowest precedence and is left associative.
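
As a quick check of these precedence rules, the sketch below compares a|bc* with its fully parenthesized form. It uses Python's `re` module, which follows the same precedence for these three operators; the sample strings are arbitrary choices for this example.

```python
import re

# Under the precedence rules, a|bc* groups as (a)|((b)((c)*)).
unparenthesized = re.compile(r"a|bc*")
explicit = re.compile(r"(a)|((b)((c)*))")

samples = ["a", "b", "bc", "bccc", "ac", "abc", "", "bcbc"]

# Both forms must agree on every sample string.
for s in samples:
    assert bool(unparenthesized.fullmatch(s)) == bool(explicit.fullmatch(s))

accepted = [s for s in samples if unparenthesized.fullmatch(s)]
print(accepted)   # ['a', 'b', 'bc', 'bccc']
```

Note that "ac" is rejected: the star applies only to c, not to the whole alternation.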

A language that can be defined by a regular expression is called a regular set.
If two regular expressions r and s denote the same regular set, they are equivalent, written r = s.
For example, (a|b) = (b|a).

REGULAR DEFINITIONS
A regular definition is a sequence of definitions of the form:
d1 → r1;
d2 → r2;
Each d is a new symbol, and each r is a regular expression over the alphabet and the previously defined d's.
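
A regular definition can be mimicked by building each pattern from the ones defined before it. The sketch below assumes a common identifier definition (letter, digit, id); these names and the exact character sets are illustrative choices, not taken from the slides.

```python
import re

# Each entry may use only the symbols defined before it.
definitions = {}
definitions["letter"] = r"[A-Za-z_]"
definitions["digit"] = r"[0-9]"
# id → letter (letter | digit)*
definitions["id"] = "{letter}({letter}|{digit})*".format(**definitions)

id_pattern = re.compile(definitions["id"])

def is_identifier(text):
    """True when the whole text matches the id definition."""
    return id_pattern.fullmatch(text) is not None

print(definitions["id"])
print(is_identifier("count1"), is_identifier("1count"))
```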
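
EXTENSIONS OF REGULAR EXPRESSIONS
Kleene introduced regular expressions, with union, concatenation, and Kleene closure, in the 1950s; extensions added since include:
One or more instances (+).
Zero or one instance (?).
Character classes, such as [abc] or [a-z].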
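
Each extension is shorthand for something the basic operators can already express. The sketch below pairs an extended pattern with a basic equivalent and checks that both agree on a few sample strings. It uses Python's `re` syntax, and the helper name `same_language_on` is made up for this example; agreement on samples is a spot check, not a proof of equivalence.

```python
import re

# (extended form, equivalent form using only |, concatenation and *)
pairs = [
    (r"a+", r"aa*"),          # one or more instances
    (r"a?", r"a|"),           # zero or one instance: a or the empty string
    (r"[abc]", r"a|b|c"),     # character class
    (r"[0-9]+", r"(0|1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)*"),
]

samples = ["", "a", "aa", "b", "c", "ab", "7", "42", "4a"]

def same_language_on(strings, extended, basic):
    """Check that both patterns accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(extended, s)) == bool(re.fullmatch(basic, s))
        for s in strings
    )

for extended, basic in pairs:
    print(extended, "~", basic, same_language_on(samples, extended, basic))
```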

RECOGNITION OF TOKENS
Take the patterns for the tokens and build a piece of code that examines the input and finds a prefix that is a lexeme matching one of the patterns.
Methods:
1. Transition diagrams
2. Recognition of reserved words and identifiers
3. Completion of the running example

TRANSITION DIAGRAMS
Patterns are first converted into stylized flowcharts called transition diagrams.
A transition diagram has states.
Edges between states are labeled with input symbols.

NOTE ON TD
Accepting (final) states indicate that a lexeme has been found.
The start state is the state where scanning begins.
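
RECOGNITION OF RESERVED WORDS AND IDENTIFIERS
The problem is telling keywords and identifiers apart, since keywords look like identifiers.
Transition diagram:
start → 0 --letter--> 10 --other--> 11: return(gettoken(), installid())
State 10 loops back to itself on letter or digit.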
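
The diagram with states 0, 10 and 11 can be simulated directly. The sketch below is a simplified illustration: the keyword set, the symbol table layout, and the Python spellings `get_token` and `install_id` are assumptions standing in for the slide's gettoken() and installid().

```python
KEYWORDS = {"if", "then", "else"}   # hypothetical reserved words

symbol_table = {}                   # lexeme -> entry index (assumed layout)

def install_id(lexeme):
    """Enter the lexeme in the symbol table if it is new; return its index."""
    if lexeme not in symbol_table:
        symbol_table[lexeme] = len(symbol_table)
    return symbol_table[lexeme]

def get_token(lexeme):
    """Return the keyword's own token name, or ID for any other lexeme."""
    return lexeme.upper() if lexeme in KEYWORDS else "ID"

def scan_identifier(text, start=0):
    """Follow 0 --letter--> 10 --letter or digit--> 10 --other--> 11."""
    state = 0
    pos = start
    while True:
        ch = text[pos] if pos < len(text) else ""   # "" stands for end of input
        if state == 0:
            if ch.isalpha():
                state, pos = 10, pos + 1
            else:
                return None          # this diagram does not apply here
        elif state == 10:
            if ch.isalnum():
                pos += 1             # stay in state 10
            else:
                state = 11           # "other": the character is not consumed
        else:                        # state 11 is the accepting state
            lexeme = text[start:pos]
            return get_token(lexeme), install_id(lexeme), pos
```

For example, `scan_identifier("count1 = 0")` stops at position 6 with token ID, while `scan_identifier("if x")` reports the keyword token IF.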

LEXICAL ANALYZER GENERATOR
The tool is Lex. Its input is written in the Lex language, and the tool itself is the Lex compiler.
The input file is lex.l.
The Lex compiler transforms it into a C program, lex.yy.c.
That file is later compiled by the C compiler into a.out.

STRUCTURE OF LEX PROGRAM
declarations
%%
translation rules
%%
auxiliary functions

CONFLICT RESOLUTION IN LEX
Always prefer a longer prefix over a shorter one.
If the longest prefix matches two or more patterns, prefer the pattern listed first.
Lookahead operator: / is inserted in a pattern to mark the end of the part that is the lexeme; the rest of the pattern must match the input but is not part of the lexeme.
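
The two conflict-resolution rules can be sketched as a small tokenizer loop. This is not how Lex is implemented internally; it is a minimal illustration in Python, and the rule list and token names are assumptions chosen for the example.

```python
import re

# Order matters: earlier rules win when match lengths are equal.
RULES = [
    ("IF", r"if"),
    ("ID", r"[A-Za-z][A-Za-z0-9]*"),
    ("LE", r"<="),
    ("LT", r"<"),
    ("WS", r"[ \t\n]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]

def next_token(text, pos):
    """Pick the rule with the longest match at pos; ties go to the first rule."""
    best_name, best_end = None, pos
    for name, regex in COMPILED:
        m = regex.match(text, pos)
        # Strict ">" keeps the earlier rule when two matches have equal length.
        if m and m.end() > best_end:
            best_name, best_end = name, m.end()
    if best_name is None:
        raise ValueError(f"no rule matches at position {pos}")
    return best_name, text[pos:best_end], best_end

def tokenize(text):
    """Split text into (token, lexeme) pairs, skipping whitespace."""
    pos, tokens = 0, []
    while pos < len(text):
        name, lexeme, pos = next_token(text, pos)
        if name != "WS":
            tokens.append((name, lexeme))
    return tokens

print(tokenize("if iffy <= x<y"))
```

Here "if" matches both IF and ID with the same length, so the rule listed first wins, while "iffy" becomes an ID and "<=" becomes LE because the longer prefix is preferred.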

FINITE AUTOMATA
At the heart of the transformation that Lex performs, turning its input program into a lexical analyzer, are finite automata.
Finite automata are recognizers: they just say yes or no for each input string.
Two types:
1. Nondeterministic
2. Deterministic

NONDETERMINISTIC
No restrictions on the labels of the edges leaving a state.
An NFA consists of:
A finite set of states S
An input alphabet Ʃ
A transition function
A start state
A set of final (accepting) states

TRANSITION TABLE
State   a       b      ɛ
0       {0,1}   {0}    ∅
1       ∅       {2}    ∅
2       ∅       {3}    ∅
3       ∅       ∅      ∅
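
ACCEPTANCE OF INPUT STRING BY NFA
An NFA accepts a string if some path from the start state to a final state spells out that string.
The string aabb is accepted by the NFA for (a|b)*abb.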
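
Acceptance can be checked by tracking the set of states the NFA could be in after each input symbol. The sketch below encodes the transition table for (a|b)*abb as a dictionary; it assumes state 0 is the start state and state 3 is the only final state, and it skips ɛ-moves because the ɛ column of the table is empty.

```python
# Transition table of the NFA; missing entries stand for the empty set.
NFA = {
    0: {"a": {0, 1}, "b": {0}},
    1: {"b": {2}},
    2: {"b": {3}},
    3: {},
}
START = 0          # assumed start state
ACCEPTING = {3}    # assumed set of final states

def nfa_accepts(text):
    """Follow every possible path at once by tracking a set of states."""
    current = {START}
    for ch in text:
        current = {t for s in current for t in NFA[s].get(ch, set())}
    return bool(current & ACCEPTING)

print(nfa_accepts("aabb"))   # True
print(nfa_accepts("aab"))    # False
```

On aabb the state sets are {0}, {0,1}, {0,1}, {0,2}, {0,3}; the last one contains state 3, so the string is accepted.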

DFA
For each state and each input symbol, there is exactly one edge leaving that state, leading to the next state.
There are no moves on ɛ.
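
Because a DFA has exactly one next state per symbol, simulating it is a simple table lookup per character. The sketch below uses a DFA for (a|b)*abb; the state names A to D and the table itself are assumptions for this example, obtained by hand with the subset construction from an NFA for the same language.

```python
# One entry per (state, symbol): exactly one next state, no ɛ-moves.
DFA = {
    "A": {"a": "B", "b": "A"},
    "B": {"a": "B", "b": "C"},
    "C": {"a": "B", "b": "D"},
    "D": {"a": "B", "b": "A"},
}
DFA_START = "A"
DFA_ACCEPTING = {"D"}

def dfa_accepts(text):
    """Run the DFA over text; reject symbols outside the alphabet {a, b}."""
    state = DFA_START
    for ch in text:
        if ch not in DFA[state]:
            return False
        state = DFA[state][ch]
    return state in DFA_ACCEPTING

print(dfa_accepts("aabb"))   # True
print(dfa_accepts("abab"))   # False
```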