8/13/2019 atcd unit 1
AUTOMATA
AND COMPILER DESIGN
B.Tech III IT B
sumalatha
UNIT - I
Formal Language and Regular Expressions:
Languages Definition
regular expressions
Regular sets, identity rules
Finite Automata:
DFA
NFA, NFA with ε-transitions, significance
Acceptance of languages
NFA to DFA conversion
Minimization of DFA
Finite automata with output: Moore and Mealy machines
Constructing finite automata for a given regular expression
Conversion of finite automata to regular expressions
What is Automata Theory?
Study of abstract computing devices, or machines
Automaton = an abstract computing device
Note: A device need not even be physical hardware!
A fundamental question in computer science: find out what different models of machines can and cannot do
The theory of computation
Alan Turing (1912-1954)
Father of modern computer science
English mathematician
Studied abstract machines called Turing machines even before computers existed
The Central Concepts of Automata Theory
Alphabet
Strings (words)
Language
Alphabet
An alphabet is a finite, non-empty set of symbols.
We use the symbol Σ (sigma) to denote an alphabet.
Examples:
Binary: Σ = {0,1}
All lower case letters: Σ = {a,b,c,...,z}
Alphanumeric: Σ = {a-z, A-Z, 0-9}
Strings
A string or word is a finite sequence of symbols chosen from Σ.
Ex: The string 01011 is formed over the alphabet {0,1}.
The empty string is ε (epsilon).
The length of a string w, denoted by |w|, is equal to the number of (non-ε) characters in the string.
E.g., x = 010100, |x| = 6
x = 01ε0ε1ε00, |x| = ?
Language
A language, L, is simply any set of strings over a fixed alphabet.
Alphabet                       Languages
{0,1}                          {0,10,100,1000,100000}
                               {0,1,00,11,000,111,...}
{a,b,c}                        {abc, aabbcc, aaabbbccc,...}
{A,...,Z}                      {TEE,FORE,BALL,...}
                               {FOR,WHILE,GOTO,...}
{A,...,Z,a,...,z,0,...,9,      {All legal C programs}
 +,-,...}
Special languages: ∅ - the EMPTY LANGUAGE; {ε} - contains the ε string only
Regular Expressions
A declarative way to express the pattern of any string over an alphabet,
or
Regular expressions are an algebraic way to describe languages.
If E is a regular expression, then L(E) is the language it defines.
A language denoted by a regular expression is said to be a regular set.
Ex: Let L be the set of letters and D the set of digits:
L = {A,B,...,Z,a,b,...,z}, D = {0,1,...,9}
Operations on these two languages are:
1. Union of L and D, written L ∪ D: the set of letters and digits.
2. Concatenation of L and D, written LD: the set of strings consisting of a letter followed by a digit.
3. L^4: the set of all four-letter strings.
4. Kleene closure of L, written L*: the set of all strings of letters, including the empty string ε (denotes zero or more concatenations of L).
5. Positive closure of D, written D⁺: the set of all strings of one or more digits (denotes one or more concatenations of D).
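The operations above can be tried out on small finite sets. The following is a minimal sketch, assuming two-symbol stand-ins for the letter set L and the digit set D; the helper names `concat`, `power`, and `closure_up_to` are illustrative and not part of the slides. Since L* and D⁺ are infinite, the closure is only approximated up to a chosen number of concatenations.

```python
def concat(X, Y):
    """Concatenation XY: every string of X followed by every string of Y."""
    return {x + y for x in X for y in Y}

def power(X, n):
    """X^n: n-fold concatenation of X with itself (X^0 = {""}, i.e. {ε})."""
    result = {""}
    for _ in range(n):
        result = concat(result, X)
    return result

def closure_up_to(X, k, positive=False):
    """Finite approximation of X* (or X+ if positive): union of X^i up to i = k."""
    first = 1 if positive else 0
    out = set()
    for i in range(first, k + 1):
        out |= power(X, i)
    return out

L = {"a", "b"}   # small stand-in for the set of letters (assumption)
D = {"0", "1"}   # small stand-in for the set of digits (assumption)

print(sorted(L | D))                               # union
print(sorted(concat(L, D)))                        # letter followed by digit
print(len(power(L, 4)))                            # number of four-letter strings
print(sorted(closure_up_to(D, 2, positive=True)))  # D+ truncated at length 2
```

The empty Python string plays the role of ε, so `closure_up_to(L, k)` contains `""` while the positive closure does not.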
Regular Expressions
Let Σ = {a, b}
(Between expressions, + and | both denote union; a superscript ⁺ denotes positive closure.)
1. The regular expression (a+b) denotes the set {a, b}
2. (a+b)(a+b) = {aa, ab, ba, bb}
3. a* = {ε, a, aa, aaa, ...}
4. (a+b)* = {ε, a, b, ab, ba, aaa, bbb, aba, aab, bbabb, ...}
5. (ab)* = {ε, ab, abab, ababab, ...}
6. a*b* = {ε, a, b, aaaa, bbb, aaabbbb, abbb, aaaaab, ...}
7. a*|b* = {ε, aaaa, bbbbbbb, ...}
8. a⁺ = {a, aa, aaa, ...}
9. a(a+b)* denotes strings with any number of a's and b's that start with a
Examples
1. 01* = {0, 01, 011, 0111, ..}
2. (01*)(01) = {001, 0101, 01101, 011101, ..}
3. (0+1)*
4. (0+1)*01(0+1)*
5. ((0+1)(0+1)+(0+1)(0+1)(0+1))*
6. ((0+1)(0+1))*+((0+1)(0+1)(0+1))*
7. (1+01+001)*(ε+0+00)
Regular Expression Examples
1. All strings that start with "tab" or end with "bat":
tab{A,...,Z,a,...,z}* | {A,...,Z,a,...,z}*bat
2. All strings in which the digits 1, 2, 3 exist in ascending numerical order:
{A,...,Z}*1{A,...,Z}*2{A,...,Z}*3{A,...,Z}*
Find regular expressions for:
1. The set of bit strings with even length:
(00+01+10+11)*
2. The set of bit strings ending with a 0, not containing 11, and not the null string:
(0+10)*(0+10) or (0+10)⁺
3. The set of bit strings containing an odd number of 0s:
1*01*(01*01*)*
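One way to gain confidence in answers like these is to compare each expression against a direct statement of the property on all short bit strings. The sketch below assumes Python's `re` module as the matcher, so the slide's union sign + is written as `|`; the names `bit_strings`, `CASES`, and `mismatches` are illustrative. A brute-force check up to a fixed length is evidence, not a proof.

```python
import re
from itertools import product

def bit_strings(max_len):
    """Yield every string over {0,1} of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)

# (pattern in Python syntax, property the pattern is supposed to describe)
CASES = [
    (r"(?:00|01|10|11)*", lambda s: len(s) % 2 == 0),
    (r"(?:0|10)*(?:0|10)", lambda s: s.endswith("0") and "11" not in s),
    (r"1*01*(?:01*01*)*", lambda s: s.count("0") % 2 == 1),
]

def mismatches(pattern, prop, max_len=8):
    """Strings on which the regular expression and the property disagree."""
    return [s for s in bit_strings(max_len)
            if bool(re.fullmatch(pattern, s)) != prop(s)]

for pattern, prop in CASES:
    print(pattern, mismatches(pattern, prop))
```

An empty list for every pattern means no disagreement was found among the strings tested.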
The following rules are used to simplify regular expressions:
1. ∅ + R = R
2. εR = R
3. R + R = R
4. R*R* = R*
5. (R*)* = R*
6. ε + RR* = R*
7. (P+Q)* = (P*Q*)* = (P*+Q*)*
8. RR* = R⁺
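The identities can be spot-checked by comparing the sets of short strings that both sides match. This is a minimal sketch, assuming Python's `re` module, the sample choices P = a, Q = b, and R = (ab|b), and a hypothetical helper `language`; it checks strings only up to a fixed length and so illustrates the rules rather than proving them.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=6):
    """All strings over `alphabet` of length <= max_len fully matched by `pattern`."""
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(alphabet, repeat=n)
        if re.fullmatch(pattern, "".join(chars))
    }

R = r"(?:ab|b)"   # sample expression standing in for R (assumption)

# Rule 7 with P = a, Q = b: (P+Q)* = (P*Q*)* = (P*+Q*)*
assert language(r"(?:a|b)*") == language(r"(?:a*b*)*") == language(r"(?:a*|b*)*")
# Rule 4: R*R* = R*
assert language(f"{R}*{R}*") == language(f"{R}*")
# Rule 5: (R*)* = R*
assert language(f"(?:{R}*)*") == language(f"{R}*")
# Rule 6: ε + RR* = R*  (the empty alternative plays the role of ε)
assert language(f"(?:|{R}{R}*)") == language(f"{R}*")
# Rule 8: RR* = R+ (positive closure)
assert language(f"{R}{R}*") == language(f"{R}+")
print("all sampled identities hold up to length 6")
```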
Transition Diagrams
A language can be recognized using a diagrammatic representation called a transition diagram.
A transition diagram is made up of a set of states and transitions from one state to another.
A transition diagram (TD) has:
States: represented by circles
Transitions (actions): represented by arrows between states
Start state: beginning of a pattern (arrowhead)
Final state(s): end of a pattern (concentric circles)
ε-transition
Each edge in a TD is labeled with the input character scanned by the machine.
An ε-transition in a TD is used to move from one state to the next state without reading any input character.
A --ε--> B
Regular Expression    Transition Diagram
1) a+b                q0 --a/b--> q1
2) ab                 q0 --a--> q2 --b--> q1
3) a*                 q1 with a self-loop on a
4) (a+b)*             q1 with a self-loop on a/b
5) (ab)*              [Figure: diagram over states q0, q2, q1 with edges labeled a and b]
1) a⁺ = aa*
2) a*|b*
3) a*b*
4) a(a+b)*
[Figure: transition diagrams for the four expressions above, drawn over states q0, q1, q2 with edges labeled a, b, and a/b.]
Finite Automata
The generalized transition diagram is called a finite automaton.
Formally, a finite automaton is defined as a five-tuple
M = {Q, Σ, δ, q0, F}
where
Q   finite set of states
Σ   set of input symbols (an alphabet)
δ   transition function; it specifies, for a state and an input symbol, where the transition goes. It maps
    δ(p, a) = q
    where p, q are states and a is an input symbol
q0  initial state
F   set of final states, F ⊆ Q
Finite automata are classified in two ways:
Finite Automata
  Finite automata without output: NFA, DFA
  Finite automata with output: Mealy, Moore
Finite Automaton without output
(Language Recognizers)
Input: String --> Finite Automaton (NFA or DFA) --> Output: Accept or Reject
Finite Automata: A recognizer that takes an input string and determines whether it is a valid sentence of the language.
Non-Deterministic: May have more than one alternative action for the same input symbol. Non-Deterministic Finite Automata (NFAs) easily represent regular expressions, but are somewhat less precise.
Deterministic: Has at most one action for a given input symbol. Deterministic Finite Automata (DFAs) require more complexity to represent regular expressions, but offer more precision.
Both types are used to recognize the languages described by regular expressions.
Non-Deterministic Finite Automata
An NFA is a mathematical model that consists of:
N = {Q, Σ, δ, q0, F}
Q, a set of states
Σ, the symbols of the input alphabet
δ, a transition function:
  δ(state, symbol) → set of states
  δ: Q × (Σ ∪ {ε}) → power(Q) (2^Q)
A state q0 ∈ Q, the start state
F ⊆ Q, a set of final or accepting states.
Representing NFAs
Transition Diagrams: number of states (circles), arcs, final states, ...
Transition Tables: more suitable representation within a computer
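NFA - Example
Given the regular expression: (a|b)*abb
Transition diagram: start --> 0; 0 --a--> 0, 0 --b--> 0, 0 --a--> 1, 1 --b--> 2, 2 --b--> 3
Q = {0, 1, 2, 3}
q0 = 0
F = {3}
Σ = {a, b}
EXAMPLE: Input: ababb (one accepting sequence of moves)
δ(0, a) = 0
δ(0, b) = 0
δ(0, a) = 1
δ(1, b) = 2
δ(2, b) = 3
ACCEPT!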
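The acceptance test for this NFA can be sketched by tracking the set of all states the machine could be in after each symbol. This is a minimal illustration; the table `NFA_DELTA` encodes the four-state NFA for (a|b)*abb described on this slide, and the function name `nfa_accepts` is an assumption of the sketch.

```python
# Transition table of the NFA for (a|b)*abb: (state, symbol) -> set of next states.
NFA_DELTA = {
    (0, "a"): {0, 1},   # on a, stay in 0 or guess that "abb" starts here
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
FINAL = {3}

def nfa_accepts(word):
    """True if some path labeled `word` leads from START to a final state."""
    current = {START}
    for symbol in word:
        # Follow every possible transition from every currently possible state.
        current = {t for s in current for t in NFA_DELTA.get((s, symbol), set())}
    return bool(current & FINAL)

print(nfa_accepts("ababb"))  # True
print(nfa_accepts("abab"))   # False
```

Missing table entries are treated as the empty set, which is how the sketch represents a dead end.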
NFA - Example
Given the regular expression: (a (b*c)) | (a (b | c⁺))
[Figure: two NFA fragments over states 1-7, one for a (b*c) and one for a (b | c⁺), with edges labeled a, b, c.]
(a (b*c)) | (a (b | c⁺))
[Figure: the two fragments combined into a single NFA with start state 0 and states 1-7.]
Input: abbc
       abcc
Transition table:
A tabular representation that can also be used to represent a finite automaton is called a transition table. In a transition table there is a row for each state and a column for each input symbol.
The entry for row i and symbol a in the table is the set of states that can be reached by a transition from state i on input a.
[Figure: NFA with states q0 and q1 and edges labeled 0/1, 0, and 1.]
Transition Table
States    Input symbol
          0          1
q0        {q0,q1}    q0
q1        -          q1
An NFA accepts an input string x if and only if there is some path labeled x in the transition diagram from the initial state to a final state.
String: 001001
[Figure: NFA with states q0, q1, q2 over {0,1}, and the tree of possible state sequences for the input 001001; at least one sequence ends in a final state, so the string is accepted.]
Thompson's construction of an NFA from a Regular Expression
1. The NFA for ε:
start --> i --ε--> f
2. The NFA for an input symbol a:
start --> i --a--> f
3. NFA for R1R2, where R1, R2 are two regular expressions:
start --> i --[R1]--> --[R2]--> f
4. NFA for R1|R2, where R1, R2 are two regular expressions:
[Figure: i has ε-transitions into R1 and into R2; both R1 and R2 have ε-transitions into f.]
5. NFA for R*:
[Figure: start --> i, with ε-transitions from i into R, from R to f, from i directly to f, and from the end of R back to its start.]
6. NFA for (R1|R2)*:
[Figure: the R* construction applied to the NFA for R1|R2, between states i and f.]
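The construction rules can be sketched as small functions that combine NFA fragments. This is an illustrative sketch, not the slides' exact construction: a fragment is assumed to be a tuple (start, accept, edges), ε-edges are labeled `None`, and concatenation links the two fragments with an ε-edge instead of merging states, so the state count differs from the diagrams. All function names are hypothetical.

```python
from itertools import count

EPS = None          # label used for ε-edges (assumption of this sketch)
_fresh = count()    # supplies new state numbers

def symbol(a):
    """NFA for a single input symbol a: i --a--> f."""
    i, f = next(_fresh), next(_fresh)
    return i, f, [(i, a, f)]

def concat(n1, n2):
    """NFA for R1R2: accept state of R1 linked to start of R2 by an ε-edge."""
    i1, f1, e1 = n1
    i2, f2, e2 = n2
    return i1, f2, e1 + e2 + [(f1, EPS, i2)]

def union(n1, n2):
    """NFA for R1|R2: new start and accept states joined by ε-edges."""
    i1, f1, e1 = n1
    i2, f2, e2 = n2
    i, f = next(_fresh), next(_fresh)
    return i, f, e1 + e2 + [(i, EPS, i1), (i, EPS, i2), (f1, EPS, f), (f2, EPS, f)]

def star(n):
    """NFA for R*: skip edge i -> f and loop-back edge from end of R to its start."""
    i1, f1, e1 = n
    i, f = next(_fresh), next(_fresh)
    return i, f, e1 + [(i, EPS, i1), (f1, EPS, f), (i, EPS, f), (f1, EPS, i1)]

def eps_closure(states, edges):
    """All states reachable from `states` using ε-edges only."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for src, label, dst in edges:
            if src == s and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, word):
    """Simulate the NFA on `word` by tracking the set of reachable states."""
    start, accept, edges = nfa
    current = eps_closure({start}, edges)
    for ch in word:
        moved = {dst for src, label, dst in edges if src in current and label == ch}
        current = eps_closure(moved, edges)
    return accept in current

# (a|b)*abb built bottom-up from the rules
a_or_b_star = star(union(symbol("a"), symbol("b")))
nfa = concat(concat(concat(a_or_b_star, symbol("a")), symbol("b")), symbol("b"))
print(accepts(nfa, "ababb"), accepts(nfa, "abab"))
```

Each rule adds at most two new states, which is why an NFA built this way stays linear in the size of the regular expression.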
The NFA for (a|b)*abb using Thompson's construction
1. r1 = a: 2 --a--> 3
2. r2 = b: 4 --b--> 5
3. r3 = r1|r2, i.e. a|b: 1 --ε--> 2, 1 --ε--> 4, 2 --a--> 3, 4 --b--> 5, 3 --ε--> 6, 5 --ε--> 6
4. r4 = (r3)*, i.e. (a|b)*: start --> 0, with 0 --ε--> 1, 6 --ε--> 7, 0 --ε--> 7, and 6 --ε--> 1 added around the NFA for a|b
5. r5 = abb: 7 --a--> 8, 8 --b--> 9, 9 --b--> 10
6. r6 = r4r5, i.e. (a|b)*abb
[Figure: the complete NFA with states 0-10, start state 0 and final state 10, formed by joining the NFA for (a|b)* to the NFA for abb at state 7.]
Regular Expression: (a|b)*abb
NFA without ε-moves: [Figure: states 0-3; 0 loops on a and b, 0 --a--> 1, 1 --b--> 2, 2 --b--> 3.]
NFA with ε-moves: [Figure: the Thompson NFA with states 0-10, start state 0 and final state 10.]
Deterministic Finite Automata
A DFA is defined as a five-tuple with the following properties:
D = {Q, Σ, δ, q0, F}
where the transition function δ defines a mapping from Q × Σ to Q,
i.e. in a DFA
i) no state has an ε-transition;
ii) there exists only one transition from a state on the same input symbol.
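A DFA can be specified as:
Q = {q0, q1}
Σ = {0,1}
Initial state: q0
F = {q1}
δ(q0,0) = q1
δ(q0,1) = q0
δ(q1,0) = q1
δ(q1,1) = q0
[Figure: transition diagram with q0 --0--> q1, q0 --1--> q0, q1 --0--> q1, q1 --1--> q0.]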
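Because a DFA has exactly one move per state and symbol, simulating it needs only a single current state. The sketch below encodes the δ listed on this slide as a Python dictionary; the names `DELTA` and `dfa_accepts` are illustrative. With q1 as the only final state, this machine accepts exactly the binary strings that end in 0.

```python
# δ of the two-state DFA: (state, symbol) -> next state
DELTA = {
    ("q0", "0"): "q1",
    ("q0", "1"): "q0",
    ("q1", "0"): "q1",
    ("q1", "1"): "q0",
}

def dfa_accepts(word, start="q0", finals=frozenset({"q1"})):
    """Run the DFA on `word` and report whether it stops in a final state."""
    state = start
    for symbol in word:
        state = DELTA[(state, symbol)]   # exactly one move per (state, symbol)
    return state in finals

print(dfa_accepts("1010"))  # True: ends in 0
print(dfa_accepts("101"))   # False: ends in 1
```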
Example - DFA
Regular Expression: (a|b)*abb
DFA: [Figure: states 0-3 with start state 0 and final state 3; edges 0 --a--> 1, 1 --b--> 2, 2 --b--> 3, plus a- and b-edges leading back to earlier states.]
Conversion of an NFA (without ε-moves) into an equivalent DFA
For every NFA there exists an equivalent DFA. Consider an NFA for the regular expression (a|b)*abb:
[Figure: start --> q0; q0 loops on a and b, q0 --a--> q1, q1 --b--> q2, q2 --b--> q3.]
Q = {q0, q1, q2, q3}
Σ = {a, b}
Initial state: q0
F = {q3}
Transition Table
State    a          b
q0       {q0,q1}    q0
q1       -          q2
q2       -          q3
q3       -          -
The DFA transition table
State      a          b
q0         {q0,q1}    q0
q1         -          q2
q2         -          q3
q3         -          -
{q0,q1}    {q0,q1}    {q0,q2}
{q0,q2}    {q0,q1}    {q0,q3}
{q0,q3}    {q0,q1}    q0
DFA Transition Diagram
[Figure: the states q0, {q0,q1}, {q0,q2}, {q0,q3} with the transitions from the table. The part containing q1, q2, q3 is not reachable from q0, so this part is eliminated.]
NFA (with ε-moves) to DFA
ε-closure(s): set of NFA states reachable from NFA state s on ε-transitions alone.
move(T, a): set of NFA states to which there is a transition on input symbol a from some NFA state s in T.
[Figure: NFA for a|b with 1 --ε--> 2, 1 --ε--> 4, 2 --a--> 3, 4 --b--> 5, 3 --ε--> 6, 5 --ε--> 6.]
ε-closure(1) = {1, 2, 4}
ε-closure(3) = {3, 6}
ε-closure({3, 5}) = {3, 5, 6}
move(2, a) = {3}
move({1, 2, 3}, a) = {3}
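Both operations are short to write down in code. The following minimal sketch assumes the six-state NFA for a|b whose edges are implied by the closures listed above; the dictionary `DELTA`, the label `EPS`, and the function names are illustrative choices.

```python
EPS = "ε"   # label used for ε-transitions in this sketch

# (state, label) -> set of next states, for the NFA of a|b with states 1-6
DELTA = {
    (1, EPS): {2, 4},
    (2, "a"): {3},
    (4, "b"): {5},
    (3, EPS): {6},
    (5, EPS): {6},
}

def eps_closure(states):
    """States reachable from `states` on ε-transitions alone (includes `states`)."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in DELTA.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def move(states, symbol):
    """States reachable from some state in `states` by one `symbol` transition."""
    return {t for s in states for t in DELTA.get((s, symbol), ())}

print(sorted(eps_closure({1})))       # [1, 2, 4]
print(sorted(eps_closure({3, 5})))    # [3, 5, 6]
print(sorted(move({1, 2, 3}, "a")))   # [3]
```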
NFA (with ε-moves) to DFA (subset construction)
Start with the NFA for the R.E. (a|b)*abb:
[Figure: Thompson NFA with states 0-10, start state 0 and final state 10.]
First we calculate ε-closure(0) (i.e., of the start state 0):
ε-closure(0) = {0, 1, 2, 4, 7} (all states reachable from 0 on ε-moves)
Let A = {0, 1, 2, 4, 7} be a state of the new DFA, D.
2nd, we calculate a: ε-closure(move(A,a)) and b: ε-closure(move(A,b)).
a: ε-closure(move(A,a)) = ε-closure(move({0,1,2,4,7},a))
move gives {3,8} (since move(2,a) = 3 and move(7,a) = 8).
From this we have: ε-closure({3,8}) = {1,2,3,4,6,7,8} (since 3 → 6 → 1 → 4, 6 → 7, and 1 → 2, all by ε-moves).
Let B = {1,2,3,4,6,7,8} be a new state. Define Dtran[A,a] = B.
b: ε-closure(move(A,b)) = ε-closure(move({0,1,2,4,7},b))
move gives {5} (since move(4,b) = 5).
From this we have: ε-closure({5}) = {1,2,4,5,6,7} (since 5 → 6 → 1 → 4, 6 → 7, and 1 → 2, all by ε-moves).
Let C = {1,2,4,5,6,7} be a new state. Define Dtran[A,b] = C.
3rd, we calculate for state B on {a,b}:
a: ε-closure(move(B,a)) = ε-closure(move({1,2,3,4,6,7,8},a)) = {1,2,3,4,6,7,8} = B
Define Dtran[B,a] = B.
b: ε-closure(move(B,b)) = ε-closure(move({1,2,3,4,6,7,8},b)) = {1,2,4,5,6,7,9} = D
Define Dtran[B,b] = D.
4th, we calculate for state C on {a,b}:
a: ε-closure(move(C,a)) = ε-closure(move({1,2,4,5,6,7},a)) = {1,2,3,4,6,7,8} = B
Define Dtran[C,a] = B.
b: ε-closure(move(C,b)) = ε-closure(move({1,2,4,5,6,7},b)) = {1,2,4,5,6,7} = C
Define Dtran[C,b] = C.
5th, we calculate for state D on {a,b}:
a: ε-closure(move(D,a)) = ε-closure(move({1,2,4,5,6,7,9},a)) = {1,2,3,4,6,7,8} = B
Define Dtran[D,a] = B.
b: ε-closure(move(D,b)) = ε-closure(move({1,2,4,5,6,7,9},b)) = {1,2,4,5,6,7,10} = E
Define Dtran[D,b] = E.
Finally, we calculate for state E on {a,b}:
a: ε-closure(move(E,a)) = ε-closure(move({1,2,4,5,6,7,10},a)) = {1,2,3,4,6,7,8} = B
Define Dtran[E,a] = B.
b: ε-closure(move(E,b)) = ε-closure(move({1,2,4,5,6,7,10},b)) = {1,2,4,5,6,7} = C
Define Dtran[E,b] = C.
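The whole calculation can be automated with a worklist of DFA states that still need processing. This is a minimal sketch of the subset construction, assuming the Thompson NFA for (a|b)*abb with states 0-10 used in this walkthrough; the dictionary `NFA` and all function names are illustrative. DFA states are named A, B, C, ... in order of discovery, which here reproduces the names used above.

```python
EPS = "ε"

# Thompson NFA for (a|b)*abb: (state, label) -> set of next states
NFA = {
    (0, EPS): {1, 7}, (1, EPS): {2, 4},
    (2, "a"): {3},    (4, "b"): {5},
    (3, EPS): {6},    (5, EPS): {6},
    (6, EPS): {1, 7},
    (7, "a"): {8},    (8, "b"): {9}, (9, "b"): {10},
}

def eps_closure(states):
    """States reachable on ε-moves alone."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in NFA.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def move(states, symbol):
    """States reachable by one transition on `symbol`."""
    return {t for s in states for t in NFA.get((s, symbol), ())}

def subset_construction(start, alphabet):
    """Return the list of DFA states (frozensets) and the table Dtran."""
    first = frozenset(eps_closure({start}))
    dstates, dtran, worklist = [first], {}, [first]
    while worklist:
        T = worklist.pop(0)                 # next unprocessed DFA state
        for a in alphabet:
            U = frozenset(eps_closure(move(T, a)))
            if not U:
                continue                    # no transition on a from T
            if U not in dstates:
                dstates.append(U)
                worklist.append(U)
            dtran[(T, a)] = U
    return dstates, dtran

dstates, dtran = subset_construction(0, "ab")
names = {S: chr(ord("A") + i) for i, S in enumerate(dstates)}
for S in dstates:
    print(names[S], sorted(S), {a: names[dtran[(S, a)]] for a in "ab"})
```

A DFA state is final when its set contains the NFA's final state 10, which in this run is only E.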
This gives the transition table Dtran for the DFA:
Dstates    Input symbol
           a    b
A          B    C
B          B    D
C          B    C
D          B    E
E          B    C
[Figure: transition diagram of the DFA with start state A and states B, C, D, E, following the table. E is the final state because it contains NFA state 10.]
Minimizing the number of states in a DFA
1. All states in the DFA are divided into two groups: final states and non-final states.
2. If, for every input symbol a, two states s and t in a group G have transitions on a to the same group, then s and t stay in the same group. Otherwise, divide G and put s and t into different groups.
3. Repeat the division until the grouping no longer changes.
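The three steps can be sketched as repeated refinement of a list of groups. The code below is a minimal illustration, assuming the five-state DFA for (a|b)*abb (states A-E, final state E) as input; the names `DTRAN`, `FINALS`, and `minimize` are hypothetical, and the sketch assumes every state has a transition on every symbol.

```python
# Transition table of the DFA for (a|b)*abb obtained by subset construction
DTRAN = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
FINALS = {"E"}

def minimize(dtran, finals, alphabet="ab"):
    """Return the final grouping of states as a list of sets."""
    # Step 1: split into non-final and final states.
    groups = [g for g in (set(dtran) - set(finals), set(finals)) if g]
    while True:
        def group_of(state):
            return next(i for i, g in enumerate(groups) if state in g)
        new_groups = []
        for g in groups:
            # Step 2: states stay together only if, on every symbol,
            # they move into the same group.
            buckets = {}
            for s in sorted(g):
                key = tuple(group_of(dtran[s][a]) for a in alphabet)
                buckets.setdefault(key, set()).add(s)
            new_groups.extend(buckets.values())
        # Step 3: stop when a pass produces no further split.
        if len(new_groups) == len(groups):
            return new_groups
        groups = new_groups

print(sorted(sorted(g) for g in minimize(DTRAN, FINALS)))
```

For this DFA the refinement ends with A and C in one group and B, D, E each alone, so the minimized DFA has four states.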
DFA Minimization
(a|b)*abb
A, C, and E have the same transitions, so they are candidates for merging.
But E is a final state and A and C are not, so E must stay separate.
Merge A and C into a single state A.
DFA after minimizing the states:
Dstates    Input symbol
           a    b
A          B    A
B          B    D
D          B    E
E          B    A
[Figure: transition diagram of the minimized DFA with start state A and states B, D, E, following the table.]
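Regular expression from finite automata
[Figure: automaton with states q0 and q1; q0 --0--> q1, q0 --1--> q0, q1 --0--> q1, q1 --1--> q0.]
Arden's theorem:
If an equation has the form R = Q + RP, where P and Q are two regular expressions and P does not contain ε, then R = QP*.
To calculate the regular expression for this automaton, we apply the following procedure (0⁺ denotes 00*):
q0 = ε + q0 1 + q1 1    ...(1)
q1 = q0 0 + q1 0        ...(2)
Apply Arden's theorem to (2):
q1 = q0 0 0*
q1 = q0 0⁺              ...(3)
Use (3) in (1):
q0 = ε + q0 1 + q0 0⁺ 1 = ε + q0 (1 + 0⁺1)
Apply Arden's theorem:
q0 = (1 + 0⁺1)*         ...(4)
q1 = (1 + 0⁺1)* 0 0*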
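The expressions obtained with Arden's theorem can be cross-checked against the automaton itself. The sketch below assumes the two-state automaton described by equations (1) and (2), with q0 as the start state, and uses Python's `re` syntax, where `|` is union and a postfix `+` is positive closure; `DELTA`, `reaches`, `REGEX`, and `first_disagreement` are illustrative names. It tests all strings up to a fixed length, which supports the result without proving it.

```python
import re
from itertools import product

# Transitions read off from the equations: q0 = ε + q0 1 + q1 1, q1 = q0 0 + q1 0
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}

def reaches(word, start="q0"):
    """State the automaton is in after reading `word`."""
    state = start
    for ch in word:
        state = DELTA[(state, ch)]
    return state

# Results (4) and the expression for q1, rewritten in Python regex syntax
REGEX = {
    "q0": r"(?:1|0+1)*",
    "q1": r"(?:1|0+1)*00*",
}

def first_disagreement(max_len=8):
    """Return a string where automaton and expression disagree, or None."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            w = "".join(chars)
            for state, pattern in REGEX.items():
                if (reaches(w) == state) != bool(re.fullmatch(pattern, w)):
                    return w
    return None

print(first_disagreement())
```

Each expression describes the strings that lead from q0 to the named state, so the regular expression of the automaton is the one belonging to whichever state is final.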
Finite automata with output
Mealy Machine: the output is associated with the input symbol (i.e. with the transition).
Moore Machine: the output is associated with the state.
Both machines are defined by a six-tuple (Q, Σ, Δ, δ, λ, q0)
where
Q   set of states
Σ   input alphabet
Δ   output alphabet
δ   transition function, Q × Σ → Q
λ   output function (Q → Δ in Moore and Q × Σ → Δ in Mealy)
q0  initial state
Both machines are deterministic in nature.
There are no final states.
Construction of Mealy and Moore machines that generate an output identical to the binary input.
Mealy Machine: a single state q0 with self-loops labeled 0/0 and 1/1.
         Input 0        Input 1
State    NS    O/P      NS    O/P
q0       q0    0        q0    1
Moore Machine: states q0 (output 0) and q1 (output 1); every 0 leads to q0 and every 1 leads to q1.
State    O/P    NS on 0    NS on 1
q0       0      q0         q1
q1       1      q0         q1
Design a Mealy machine which will increment the value of a given binary number by 1.
Design a Mealy machine to obtain the 2's complement of a given binary number.
Design a Moore machine to determine the residue mod 3 of each binary string treated as a binary integer.
In a Moore machine, if an input of length n bits is received, an output of n+1 bits is produced.
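One possible design for the mod 3 machine, given here only as a sketch, uses three states, where state r means "the bits read so far have value r mod 3". It assumes the input is read most significant bit first, so appending a bit b changes the value v to 2v + b. The names `DELTA`, `OUTPUT`, and `moore_run` are illustrative.

```python
# State r = (value of the bits read so far) mod 3.
# Reading bit b turns value v into 2*v + b, so r becomes (2*r + b) mod 3.
DELTA = {(r, bit): (2 * r + int(bit)) % 3 for r in range(3) for bit in "01"}

# Moore output function: each state emits its own residue.
OUTPUT = {0: "0", 1: "1", 2: "2"}

def moore_run(bits, start=0):
    """Return the output string; its last symbol is the residue of the input."""
    state = start
    out = [OUTPUT[state]]            # a Moore machine also emits for the start state
    for bit in bits:
        state = DELTA[(state, bit)]
        out.append(OUTPUT[state])
    return "".join(out)

print(moore_run("1011"))   # 1011 is 11 in decimal, and 11 mod 3 = 2
```

The output for a 4-bit input has 5 symbols, which illustrates the n+1 rule: the extra symbol is the output of the initial state before any input is read.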
Assignment - 1
1. Define the following:
(a) Alphabet (b) String (c) Language
2. Explain in detail the closure properties and identity rules of regular sets.
3. Write regular expressions for the following languages:
(a) All strings of lowercase letters that contain the five vowels in order.
(b) All strings of lowercase letters in which the letters are in ascending lexicographic order.
4. Construct NFAs with ε-moves for the following regular expressions:
(i) (11+0)*(00+1)* (ii) 10 + (0+11)0*1 (iii) (a+b)*(aa+bb)(a+b)*
5. Construct minimum-state DFAs for the following regular languages:
(i) (0+1)*1(0+1)*
(ii) Let L be the set of all binary strings whose last two symbols are the same.
6. Obtain the regular expression for the following finite automaton.
7. Construct Moore and Mealy machines that accept all binary strings as input and produce output y if the string ends with two consecutive symbols of the same type, and n otherwise.
8. Prove that (a*|b*)* and ((ε|a)b*)* are equal.