LEXICAL ANALYZER By : Kiran Acharya inspireray.blogspot.in

Lexical analyzer

DESCRIPTION

This is a PowerPoint presentation about the lexical analyzer.

Page 1: Lexical analyzer

LEXICAL ANALYZER

By : Kiran Acharya

inspireray.blogspot.in

Page 2: Lexical analyzer

TOPICS COVERED
Introduction to lexical analyzer
Input buffering
Specifications of tokens
Regular expressions

Page 3: Lexical analyzer

REGULAR EXPRESSIONS
There are two basis rules for regular expressions.
1. ε is a regular expression, and L(ε) is {ε}, that is, the language whose sole member is the empty string.
2. If a is a symbol in Σ, then a is a regular expression, and L(a) = {a}, that is, the language with one string, of length one, with a in its one position.

Page 4: Lexical analyzer

INDUCTION
If r and s are regular expressions denoting the languages L(r) and L(s), then:
(r)|(s) is a regular expression denoting L(r) ∪ L(s).
(r)(s) is a regular expression denoting L(r)L(s).
(r)* is a regular expression denoting (L(r))*.
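The three induction rules can be tried out on small finite languages. The following is only a sketch: the function names `union`, `concat` and `star` are chosen here for illustration, and `star` is cut off at a maximum string length because the real closure is infinite.

```python
def union(L, M):
    """L(r) ∪ L(s): strings that are in either language."""
    return L | M


def concat(L, M):
    """L(r)L(s): every string of L followed by every string of M."""
    return {x + y for x in L for y in M}


def star(L, max_len):
    """(L(r))*: zero or more concatenations of strings from L.

    Truncated at max_len (an assumption of this sketch), since the
    true closure contains infinitely many strings.
    """
    result = {""}        # the empty string is always in the closure
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string from L and
        # keep only those that are new and short enough.
        frontier = {w for w in concat(frontier, L) if len(w) <= max_len} - result
        result |= frontier
    return result


La, Lb = {"a"}, {"b"}
print(sorted(union(La, Lb)))            # language of a|b
print(sorted(concat(La, Lb)))           # language of ab
print(sorted(star(union(La, Lb), 2)))   # strings of (a|b)* up to length 2
```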

Page 5: Lexical analyzer

The unary operator * has the highest precedence and is left associative.
Concatenation has the second highest precedence and is left associative.
| has the lowest precedence and is left associative.
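These precedence rules let parentheses be dropped. As a quick check, Python's `re` module (used here only as a convenient stand-in, it is not part of the slides) follows the same precedence, so `a|bc*` should behave like the fully parenthesized `(a)|((b)((c)*))` and differently from `(a|b)c*`.

```python
import re


def matches(pattern, text):
    """True if the whole text belongs to the language of the pattern."""
    return re.fullmatch(pattern, text) is not None


samples = ["", "a", "b", "bccc", "ac", "bcbc"]

# With the precedence rules applied, a|bc* means (a)|((b)((c)*)).
for s in samples:
    assert matches(r"a|bc*", s) == matches(r"(a)|((b)((c)*))", s)

# Grouping the | first gives a different language: "ac" is now accepted.
print(matches(r"a|bc*", "ac"), matches(r"(a|b)c*", "ac"))
```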

Page 6: Lexical analyzer

A language that can be defined by a regular expression is called a regular set.
If two regular expressions r and s denote the same regular set, they are equivalent, written r = s.
For example, (a|b) = (b|a).
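Equivalence can be spot-checked by comparing two expressions on every string up to some length. The sketch below uses Python's `re` module; the helper name `agree_up_to` is made up for this example, and agreement on short strings is only evidence of equivalence, not a proof.

```python
import re
from itertools import product


def agree_up_to(r, s, alphabet, max_len):
    """Compare two regular expressions on all strings over the
    alphabet with length 0..max_len. Returns False at the first
    string that one accepts and the other rejects."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            w = "".join(chars)
            if bool(re.fullmatch(r, w)) != bool(re.fullmatch(s, w)):
                return False
    return True


print(agree_up_to(r"a|b", r"b|a", "ab", 4))   # | is commutative
print(agree_up_to(r"ab", r"ba", "ab", 4))     # concatenation is not
```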

Page 7: Lexical analyzer

REGULAR DEFINITIONS
A regular definition is a sequence of definitions of the form:
d1 → r1;
d2 → r2;
Each d is a new symbol, not in the alphabet Σ, and each r is a regular expression over Σ and the previously defined d's.
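A regular definition can be mimicked by building each expression from the ones defined before it. The names `letter`, `digit` and `id` below are the usual textbook example and are an assumption here, not something taken from the slides; Python's `re` module stands in for the regular expression notation.

```python
import re

# Each entry may refer only to names defined before it, which
# mirrors the rule that r_i uses only d_1 .. d_(i-1).
definitions = {}
definitions["letter"] = "[A-Za-z_]"
definitions["digit"] = "[0-9]"
definitions["id"] = "{letter}({letter}|{digit})*".format(**definitions)

print(definitions["id"])   # the fully expanded expression

for text in ["count1", "_tmp", "1abc"]:
    ok = re.fullmatch(definitions["id"], text) is not None
    print(text, ok)
```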

Page 8: Lexical analyzer

EXTENSIONS OF REGULAR EXPRESSIONS
Kleene introduced regular expressions in the 1950s; extensions added since then include:
One or more instances (+)
Zero or one instance (?)
Character classes ([abc])
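Each extension is shorthand for something the basic operators can already express: r+ for rr*, r? for r|ε, and [abc] for a|b|c. The sketch below checks this on a few sample strings using Python's `re` module, which supports the same three notations; the helper `same_on` is made up for this example.

```python
import re


def same_on(p, q, samples):
    """True if patterns p and q accept exactly the same samples."""
    return all(
        bool(re.fullmatch(p, s)) == bool(re.fullmatch(q, s)) for s in samples
    )


samples = ["", "a", "aa", "aaa", "b", "ab", "c", "d"]

print(same_on(r"a+", r"aa*", samples))        # one or more instances
print(same_on(r"a?", r"(a|)", samples))       # zero or one instance
print(same_on(r"[abc]", r"a|b|c", samples))   # character class
```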

Page 9: Lexical analyzer

RECOGNITION OF TOKENS
Take the patterns for the tokens and build a piece of code that examines the input and finds a prefix that is a lexeme matching one of the patterns.
Methods:
1. Transition diagrams
2. Recognition of reserved words and identifiers
3. Completion of the running example

Page 10: Lexical analyzer

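TRANSITION DIAGRAMS
Patterns are first converted into stylized flowcharts called transition diagrams.
A transition diagram has states, drawn as nodes.
Edges lead from one state to another and are labeled with input symbols.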

Page 11: Lexical analyzer

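NOTE ON TD
A transition diagram has a first (start) state and final (accepting) states.
Accepting state: indicates that a lexeme has been found.
Start state: where the diagram begins before any input is read, marked by an edge labeled "start".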

Page 12: Lexical analyzer

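RECOGNITION OF RESERVED WORDS AND IDENTIFIERS
The problem is telling keywords apart from identifiers, since keywords look like identifiers.
Transition diagram (on acceptance the action is return(gettoken(), installid())):
start -> (0) --letter--> (10) --other--> (11)
State 10 loops back to itself on "letter or digit".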
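The diagram can be simulated directly with one branch per state. This is a sketch only: the state numbers 0, 10 and 11 follow the slide, while the function name `scan_identifier`, the `KEYWORDS` table and the token names are assumptions made for illustration (the slide's gettoken() and installid() are not reproduced).

```python
KEYWORDS = {"if", "then", "else"}   # hypothetical reserved-word table


def scan_identifier(text, pos=0):
    """Run the three-state diagram starting at text[pos].

    Returns (token, lexeme, next_pos), or None if the input at pos
    does not start with a letter.
    """
    state = 0
    i = pos
    while True:
        ch = text[i] if i < len(text) else ""   # "" marks end of input
        if state == 0:
            if ch.isalpha():        # edge "letter"
                state = 10
                i += 1
            else:
                return None
        elif state == 10:
            if ch.isalnum():        # loop "letter or digit"
                i += 1
            else:                   # edge "other" leads to state 11
                # Accept without consuming the "other" character.
                lexeme = text[pos:i]
                # Reserved words are separated from identifiers by a
                # table lookup after the lexeme has been found.
                token = lexeme.upper() if lexeme in KEYWORDS else "ID"
                return token, lexeme, i


print(scan_identifier("count1 = 0"))
print(scan_identifier("if x"))
print(scan_identifier("9lives"))
```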

Page 13: Lexical analyzer

LEXICAL ANALYZER GENERATOR
The tool is Lex.
Its input is written in the Lex language, and the tool itself is the Lex compiler.
The input file is lex.l.
The Lex compiler transforms it into a C program, lex.yy.c.
That file is later compiled by the C compiler into a.out.

Page 14: Lexical analyzer

STRUCTURE OF LEX PROGRAM
declarations
%%
translation rules
%%
auxiliary functions

Page 15: Lexical analyzer

CONFLICT RESOLUTION IN LEX
Always prefer a longer prefix over a shorter one.
If the longest prefix matches two or more patterns, prefer the pattern listed first.
Lookahead operator: / is inserted in a pattern to mark the end of the part that is the lexeme; the text after / must also match but is not part of the lexeme.
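The two conflict-resolution rules can be sketched as a small scanner loop. This is not how Lex is implemented (Lex compiles the rules into an automaton); it only illustrates the rules. The rule list `RULES`, the token names and the function `tokenize` are assumptions made for this example.

```python
import re

# Order matters: IF is listed before ID so it wins a tie.
RULES = [
    ("IF", r"if"),
    ("ID", r"[A-Za-z][A-Za-z0-9]*"),
    ("NUM", r"[0-9]+"),
    ("WS", r"[ \t\n]+"),
]


def tokenize(text):
    compiled = [(name, re.compile(pattern)) for name, pattern in RULES]
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, rx in compiled:
            m = rx.match(text, pos)
            # Rule 1: a strictly longer match replaces the current best.
            # Rule 2: on a tie the earlier-listed rule is kept.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":            # whitespace is skipped
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("if iffy 42"))
```

On the input "if iffy 42", the lexeme "if" matches both IF and ID with the same length, so the first-listed IF wins, while "iffy" goes to ID because the longer prefix is preferred.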

Page 16: Lexical analyzer

FINITE AUTOMATA
At the heart of the transition from the input Lex program to the lexical analyzer are finite automata.
Finite automata are recognizers: they simply say yes or no about each input string.
Two types:
1. Nondeterministic
2. Deterministic

Page 17: Lexical analyzer

NON DETERMINISTIC
There are no restrictions on the labels of the edges leaving the same state.
An NFA consists of:
A finite set of states S
An input alphabet Σ
A transition function
A start state
A set of final (accepting) states

Page 18: Lexical analyzer

TRANSITION TABLE
State   a       b       ε
0       {0,1}   {0}     ∅
1       ∅       {2}     ∅
2       ∅       {3}     ∅
3       ∅       ∅       ∅

Page 19: Lexical analyzer

ACCEPTANCE OF INPUT STRING BY NFA
The input string aabb is accepted by the NFA for (a|b)*abb.
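Acceptance can be checked by tracking the set of states the NFA could be in after each input symbol. The sketch below encodes the transition table from the previous slide as a dictionary. Taking state 0 as the start state and state 3 as the only accepting state is an assumption, since the slides do not mark them; the table has no ε-moves, so no ε-closure step is needed.

```python
# Transition table: state -> symbol -> set of next states.
# A missing entry corresponds to the empty set in the table.
NFA = {
    0: {"a": {0, 1}, "b": {0}},
    1: {"b": {2}},
    2: {"b": {3}},
    3: {},
}
START = 0          # assumed start state
ACCEPTING = {3}    # assumed accepting state


def nfa_accepts(word):
    """True if some path from START on this word ends in ACCEPTING."""
    current = {START}
    for symbol in word:
        # Follow every edge labeled with this symbol from every
        # state the NFA might currently be in.
        current = set().union(*(NFA[s].get(symbol, set()) for s in current))
    return bool(current & ACCEPTING)


print(nfa_accepts("aabb"))
print(nfa_accepts("abab"))
```

For aabb the sets of possible states are {0}, {0,1}, {0,1}, {0,2}, {0,3}; the last set contains state 3, so the string is accepted along the path 0, 0, 1, 2, 3.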

Page 20: Lexical analyzer

DFA
For each state and each input symbol, there is exactly one edge leaving that state with that symbol as its label, leading to the next state.
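Because every (state, symbol) pair has exactly one next state, a DFA can be run with a single current state and one table lookup per character. The table below is a hand-built DFA for (a|b)*abb, given as an assumption for illustration since the slides do not show one; state 0 is taken as the start state and state 3 as the accepting state.

```python
# (state, symbol) -> the single next state.
DFA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
START = 0
ACCEPTING = {3}


def dfa_accepts(word):
    """Run the DFA on a word over the alphabet {a, b}."""
    state = START
    for symbol in word:
        state = DFA[(state, symbol)]   # exactly one choice per step
    return state in ACCEPTING


print(dfa_accepts("aabb"))
print(dfa_accepts("abba"))
```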