Toc Unit III

UNIT – III PUSH DOWN AUTOMATA

DEFINITIONS

Ø Every regular language or regular grammar has an equivalent finite state automaton.

Ø The automaton corresponding to a CFL is known as a pushdown automaton (PDA).

Ø The pushdown automaton is a non-deterministic device; the deterministic version of the PDA accepts only a subset of the CFLs.

Ø Hence the correspondence between the automaton and the set of languages is less satisfactory here than in the regular case.

Ø The pushdown automaton consists of an input tape, a finite control and a stack (a first-in, last-out store).

Ø The specialty of a stack is that addition (and deletion) of symbols to (or from) the stack can be done only at the top of the stack.

Ø Such a stack along with the finite control can be used to recognize non-regular languages.

Ø Consider the CFL

L = { w c w^R | w belongs to (a | b)* }

Ø We can show that the language L is not regular with the help of the pumping lemma. Furthermore, L is a CFL generated by the CFG

S -> a S a
S -> b S b
S -> c

Let us construct a PDA with two states q1 and q2. The device operates by the following rules.

1) The machine starts with a Z on the stack and with the finite control in state q1.

2) If the device is in state q1:

i) If the input to the device is a, then A is pushed onto the stack.

ii) If the input to the device is b, then B is pushed onto the stack.

3) If the input to the device is c and the device is in state q1, then the state is changed to q2.

4) If the device is in state q2:

i) If the input is a and the top of the stack is A, then the A is removed and control remains in q2.

ii) If the input is b and the top of the stack is B, then the B is removed and control remains in q2.

5) If the device is in q2 and the top of the stack is Z, then the Z is removed from the stack.

6) In all cases other than those described above, the device can make no move.

Ø This device accepts an input string if, on processing the last symbol of the input, the stack becomes empty.

Ø The device operates in the following manner. In state q1, the device transfers the input string to the stack.

Ø When it reads the c, it changes to state q2. In state q2 the device compares the remaining input with the symbols on the stack; if they match, the stack is emptied and the input is accepted. In all other cases the input is rejected.
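As a rough illustration, rules 1) to 6) can be followed almost literally in a few lines of Python. This is only a sketch: the function name accepts_wcwr is made up for this example, and rule 5 is applied once the input is exhausted, which is a simplifying assumption that does not change which strings are accepted (removing Z earlier would leave the device stuck with unread input).

```python
def accepts_wcwr(word: str) -> bool:
    """Run the two-state device of rules 1) to 6) on `word` over {a, b, c}."""
    stack = ["Z"]                  # rule 1: Z on the stack ...
    state = "q1"                   # ... and the finite control in q1
    symbol_for = {"a": "A", "b": "B"}   # pushed in q1, matched in q2

    for ch in word:
        if state == "q1" and ch in symbol_for:
            stack.append(symbol_for[ch])            # rule 2: push A or B
        elif state == "q1" and ch == "c":
            state = "q2"                            # rule 3: switch to q2
        elif state == "q2" and ch in symbol_for and stack[-1] == symbol_for[ch]:
            stack.pop()                             # rule 4: match and pop
        else:
            return False                            # rule 6: no move possible

    if state == "q2" and stack == ["Z"]:
        stack.pop()                                 # rule 5: remove Z
    return not stack                                # accept on empty stack


if __name__ == "__main__":
    for w in ["abcba", "c", "abcab", "abc"]:
        print(repr(w), accepts_wcwr(w))
```

The list is used as the stack with its last element as the top, so Z stays at the bottom until the whole of w^R has been matched.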

There are two different ways in which a PDA can accept a language:

i) Acceptance by empty stack, i.e. the input is accepted if the stack is empty after processing the last input symbol.

ii) Acceptance by final state, i.e. the input is accepted if the finite control is in a final state after processing the last input symbol.

Now we shall formally define a PDA. A PDA M is a system (Q, Σ, Γ, δ, q0, Z0, F) where

1) Q is a finite set of states,

2) Σ is an alphabet called the input alphabet,

3) Γ is an alphabet called the stack alphabet,

4) q0 in Q is the initial state,

5) Z0 in Γ is a particular stack symbol called the start symbol,

6) F ⊆ Q is the set of final states, and

7) δ is a mapping from Q x (Σ ∪ {ε}) x Γ to finite subsets of Q x Γ*.

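MOVES

Here the function δ represents the moves of the PDA.

Consider δ(q, a, Z) = {(p1, γ1), (p2, γ2), ..., (pm, γm)}

where q and pi, 1 ≤ i ≤ m, are states of the finite control, a is the input symbol and Z is the symbol at the top of the stack.

Ø Then the PDA in state q, with input symbol a and Z at the top of the stack, can, for any i with 1 ≤ i ≤ m, enter state pi and replace Z with the string γi.

Ø After this the input head is advanced by one symbol. This is known as a move. (If a = ε, the move is made without reading an input symbol and the input head is not advanced.)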

Now let us design a formal PDA that accepts the language {w c w^R | w in (a+b)*} by empty stack. Since acceptance is by empty stack, the set of final states is empty.

M = ({q1, q2}, {a, b, c}, {Z, A, B}, δ, q1, Z, φ)

δ (q1, a, Z) = {(q1, AZ)}

δ (q1, a, A) = {(q1, AA)}

δ (q1, a, B) = {(q1, AB)}

δ (q2, a, A) = {(q2, ε)}

δ (q2, b, B) = {(q2, ε)}

δ (q2, ε, Z) = {(q2, ε)}

δ (q1, c, Z) = {(q2, Z)}

δ (q1, c, A) = {(q2, A)}

δ (q1, c, B) = {(q2, B)}

δ (q1, b, Z) = {(q1, BZ)}

δ (q1, b, A) = {(q1, BA)}

δ (q1, b, B) = {(q1, BB)}

INSTANTANEOUS DESCRIPTION (ID)

Ø The instantaneous description of a PDA is used to formally describe the configuration of that PDA at a particular instant.

Ø The ID contains the state q in which the PDA is at that moment, the string w of unexpended input symbols and the string γ being held on the stack.

Ø Thus we define the ID to be a triple (q, w, γ). If M = (Q, Σ, Γ, δ, q0, Z0, F) is a PDA then we say

(q, aw, Zα) |-- (p, w, βα) if δ(q, a, Z) contains (p, β), i.e. the ID (p, w, βα) can be reached from the ID (q, aw, Zα) in a single move of the PDA. Here a may be ε or an input symbol.

Ø We use |--* for the reflexive and transitive closure of |--, i.e. I |--* I for every ID I, and for IDs I, J, K, I |--* J and J |--* K imply I |--* K. Thus I |--* J means that ID J can be reached from ID I by zero or more moves of the PDA.
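To make the relation |-- concrete, the following Python sketch computes the IDs reachable in one move and follows them for the PDA M that accepts {w c w^R}. The names DELTA, successors and trace are made up for this illustration; stack strings are written with the top symbol first, and the empty string "" stands for ε. These representation choices are assumptions of the sketch, not part of the formal definition.

```python
# Transition table of M for {w c w^R}. Keys are (state, input or "", stack top);
# values are sets of (next state, string that replaces the top symbol).
DELTA = {
    ("q1", "a", "Z"): {("q1", "AZ")},
    ("q1", "a", "A"): {("q1", "AA")},
    ("q1", "a", "B"): {("q1", "AB")},
    ("q1", "b", "Z"): {("q1", "BZ")},
    ("q1", "b", "A"): {("q1", "BA")},
    ("q1", "b", "B"): {("q1", "BB")},
    ("q1", "c", "Z"): {("q2", "Z")},
    ("q1", "c", "A"): {("q2", "A")},
    ("q1", "c", "B"): {("q2", "B")},
    ("q2", "a", "A"): {("q2", "")},
    ("q2", "b", "B"): {("q2", "")},
    ("q2", "", "Z"): {("q2", "")},
}


def successors(delta, config):
    """Return every ID J such that config |-- J (a single move)."""
    state, rest, stack = config
    if not stack:
        return []                      # no move is possible on an empty stack
    top, below = stack[0], stack[1:]
    result = []
    # ε-moves: the input is left untouched
    for p, beta in sorted(delta.get((state, "", top), ())):
        result.append((p, rest, beta + below))
    # ordinary moves: the first remaining input symbol is consumed
    if rest:
        for p, beta in sorted(delta.get((state, rest[0], top), ())):
            result.append((p, rest[1:], beta + below))
    return result


def trace(delta, config, limit=1000):
    """Follow moves while exactly one is available; return the IDs visited."""
    ids = [config]
    for _ in range(limit):
        nxt = successors(delta, ids[-1])
        if len(nxt) != 1:
            break
        ids.append(nxt[0])
    return ids


if __name__ == "__main__":
    for ident in trace(DELTA, ("q1", "abcba", "Z")):
        print(ident)
```

For the input abcba the trace ends in the ID (q2, ε, ε), so (q1, abcba, Z) |--* (q2, ε, ε). For abcab the trace stops at (q2, ab, BAZ), from which no move is possible.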

DETERMINISTIC PDAs

A PDA is said to be deterministic if at most one move is possible from any ID. Formally, we say that a PDA is deterministic if

i) For each q in Q and Z in Γ, whenever δ(q, ε, Z) is non-empty, then δ(q, a, Z) is empty for all a in Σ.

ii) For each q in Q, Z in Γ and a in Σ ∪ {ε}, δ(q, a, Z) contains at most one element.

That is, a PDA is deterministic if it never has a choice between two different moves for the same ID.

Ø The first condition ensures that there is no choice between a move that is independent of the input symbol (an ε-move) and a move involving an input symbol a.

Ø The second condition prevents a choice of move for any (q, a, Z).
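The two conditions are easy to check mechanically on a finite transition table. The sketch below assumes the table is a Python dict from (state, input or "", stack top) to a set of moves; the function name is_deterministic and the variant table GUESS are hypothetical and serve only as an illustration.

```python
EPS = ""  # stands for ε


def is_deterministic(delta, sigma):
    """Check conditions i) and ii) for delta[(q, a, Z)] = set of (p, γ)."""
    for (q, a, z), moves in delta.items():
        if len(moves) > 1:
            return False                       # violates condition ii)
        if a == EPS and moves:
            # an ε-move exists for (q, Z): no input move may exist for (q, Z)
            if any(delta.get((q, b, z)) for b in sigma):
                return False                   # violates condition i)
    return True


# The PDA M for {w c w^R} (stack strings written top-first).
M = {("q1", x, t): {("q1", X + t)}
     for x, X in (("a", "A"), ("b", "B")) for t in "ZAB"}
M.update({("q1", "c", t): {("q2", t)} for t in "ZAB"})
M.update({("q2", "a", "A"): {("q2", EPS)},
          ("q2", "b", "B"): {("q2", EPS)},
          ("q2", EPS, "Z"): {("q2", EPS)}})

# Hypothetical variant for {w w^R}: without the marker c the machine has to
# guess the middle of the input by an ε-move from q1 to q2.
GUESS = {k: set(v) for k, v in M.items() if k[1] != "c"}
GUESS.update({("q1", EPS, t): {("q2", t)} for t in "ZAB"})

if __name__ == "__main__":
    print(is_deterministic(M, "abc"), is_deterministic(GUESS, "ab"))
```

M passes both conditions, while GUESS fails condition i) because in state q1 it can either read an input symbol or make the ε-move.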

Ø We know that for FSAs the deterministic and non-deterministic versions are equivalent.

Ø No extra power is added to FSAs by adding non-determinism. All languages accepted by an NFA can be accepted by a DFA. This is not true for PDAs.

Ø A language that is accepted by a non-deterministic PDA need not be accepted by any deterministic PDA.

Ø In fact the language consisting of all palindromes over an alphabet Σ with at least two symbols is accepted by a non-deterministic PDA but not by any deterministic PDA.

Accepted languages: We have seen that a PDA may accept a string by ending up in one of the final states or by ending up with an empty stack. Correspondingly, for a

PDA M = (Q, Σ, Γ, δ, q0, Z0, F) we define L(M), the language accepted by final state, to be

{w | (q0, w, Z0) |--* (p, ε, γ) for some p in F and γ in Γ*}

and we define N(M), the language accepted by empty stack, to be

{w | (q0, w, Z0) |--* (p, ε, ε) for some p in Q}.
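Both definitions can be tested on small inputs by exploring all IDs reachable from the initial ID. The following Python sketch does this breadth-first; the names reachable, in_L and in_N and the bound limit are assumptions of this illustration (a bound is needed because a PDA with ε-moves that push symbols can reach infinitely many IDs).

```python
from collections import deque


def reachable(delta, start, limit=10_000):
    """Set of IDs J with start |--* J, explored breadth-first up to `limit` IDs."""
    seen, queue = {start}, deque([start])
    while queue and limit > len(seen):
        q, rest, stack = queue.popleft()
        if not stack:
            continue                                # empty stack: no moves
        choices = [("", rest)]                      # ε-move
        if rest:
            choices.append((rest[0], rest[1:]))     # move on an input symbol
        for sym, remaining in choices:
            for p, beta in delta.get((q, sym, stack[0]), ()):
                nxt = (p, remaining, beta + stack[1:])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


def in_L(delta, q0, z0, finals, w):
    """w in L(M): some reachable ID has no input left and a final state."""
    return any(rest == "" and p in finals
               for p, rest, _ in reachable(delta, (q0, w, z0)))


def in_N(delta, q0, z0, w):
    """w in N(M): some reachable ID has no input left and an empty stack."""
    return any(rest == "" and stack == ""
               for _, rest, stack in reachable(delta, (q0, w, z0)))


# The PDA M for {w c w^R}, stack strings written top-first, "" for ε.
M = {("q1", x, t): {("q1", X + t)}
     for x, X in (("a", "A"), ("b", "B")) for t in "ZAB"}
M.update({("q1", "c", t): {("q2", t)} for t in "ZAB"})
M.update({("q2", "a", "A"): {("q2", "")},
          ("q2", "b", "B"): {("q2", "")},
          ("q2", "", "Z"): {("q2", "")}})

if __name__ == "__main__":
    print(in_N(M, "q1", "Z", "abcba"), in_L(M, "q1", "Z", set(), "abcba"))
```

With F = φ the machine M accepts abcba by empty stack but nothing by final state. If one hypothetically took F = {q2}, the string abc would be in L(M) although it is not in N(M), which shows that the two languages of the same machine can differ.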

PUSHDOWN AUTOMATA AND CONTEXT-FREE LANGUAGES

Ø We shall now prove the fundamental result that the class of languages accepted by PDAs is precisely the class of context-free languages.

Ø We first show that the languages accepted by PDAs by final state are exactly the languages accepted by PDAs by empty stack.

Ø We then show that the languages accepted by empty stack are exactly the context-free languages.

Equivalence of acceptance by final state and empty stack

Theorem: If L is L(M2) for some PDA M2, then L is N(M1) for some PDA M1.

Proof: We would like M1 to simulate M2, with the option for M1 to erase its stack whenever M2 enters a final state.

We use state qe of M1 to erase the stack, and we use a bottom-of-stack marker X0 for M1 so that M1 does not accidentally accept if M2 empties its stack without entering a final state.

Let M2 = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA such that L = L(M2).

Let M1 = (Q ∪ {qe, q0'}, Σ, Γ ∪ {X0}, δ', q0', X0, φ),

where δ' is defined as follows:

1) δ'(q0', ε, X0) = {(q0, Z0X0)}.

2) δ'(q, a, Z) includes the elements of δ(q, a, Z) for all q in Q, a in Σ or a = ε, and Z in Γ.

3) For all q in F and Z in Γ ∪ {X0}, δ'(q, ε, Z) contains (qe, ε).

4) For all Z in Γ ∪ {X0}, δ'(qe, ε, Z) contains (qe, ε).

Ø Rule (1) causes M1 to enter the initial ID of M2, except that M1 will have its own bottom-of-stack marker X0, which is below the symbols of M2's stack.

Ø Rule (2) allows M1 to simulate M2. Should M2 ever enter a final state, rules (3) and (4) allow M1 the choice of entering state qe and erasing its stack, thereby accepting the input, or continuing to simulate M2.

Ø One should note that M2 may possibly erase its entire stack for some input x not in L(M2).

Ø This is the reason that M1 has its own special bottom-of-stack marker.

Ø Otherwise M1, in simulating M2, would also erase its entire stack, thereby accepting x when it should not.

Ø Conversely, if M1 accepts x by empty stack, it is easy to show that the sequence of moves must be one move by rule (1), then a sequence of moves by rule (2) in which M1 simulates acceptance of x by M2, followed by the erasure of M1's stack using rules (3) and (4). Thus x must be in L(M2).
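The construction of M1 from M2 by rules (1) to (4) can be written down directly. In the Python sketch below the names final_to_empty and accepts_by_empty_stack, the default names of the new states and the marker "X" are all assumptions; stack strings are written top-first, "" stands for ε, and the example machine M2 for {a^n b^n | n ≥ 1} is a hypothetical PDA chosen only to exercise the construction.

```python
from collections import deque


def final_to_empty(delta, gamma, q0, z0, finals, q0_new="q0'", qe="qe", x0="X"):
    """Build δ' of M1 from M2 by rules (1) to (4).

    Returns (δ', start state of M1, start stack symbol of M1).
    Assumes x0, q0_new and qe do not already occur in M2.
    """
    new = {key: set(moves) for key, moves in delta.items()}   # rule (2)
    new[(q0_new, "", x0)] = {(q0, z0 + x0)}                   # rule (1)
    for z in set(gamma) | {x0}:
        for q in finals:
            new.setdefault((q, "", z), set()).add((qe, ""))   # rule (3)
        new.setdefault((qe, "", z), set()).add((qe, ""))      # rule (4)
    return new, q0_new, x0


def accepts_by_empty_stack(delta, q0, z0, word, limit=10_000):
    """Breadth-first search over IDs; True if (q0, word, z0) |--* (p, ε, ε)."""
    start = (q0, word, z0)
    seen, queue = {start}, deque([start])
    while queue and limit > len(seen):
        q, rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True
        if not stack:
            continue
        choices = [("", rest)] + ([(rest[0], rest[1:])] if rest else [])
        for sym, remaining in choices:
            for p, beta in delta.get((q, sym, stack[0]), ()):
                nxt = (p, remaining, beta + stack[1:])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


# Hypothetical M2 accepting {a^n b^n | n >= 1} by final state f.
M2 = {
    ("p", "a", "Z"): {("p", "AZ")},
    ("p", "a", "A"): {("p", "AA")},
    ("p", "b", "A"): {("r", "")},
    ("r", "b", "A"): {("r", "")},
    ("r", "", "Z"): {("f", "Z")},
}
D1, START1, X0 = final_to_empty(M2, "ZA", "p", "Z", {"f"})

if __name__ == "__main__":
    for w in ["aabb", "aab", "abb"]:
        print(w, accepts_by_empty_stack(D1, START1, X0, w))
```

On the input abb the simulated M2 reaches its final state only with one b still unread, so M1 may empty its stack but does not accept, in line with the requirement that the whole input be consumed.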

Theorem: If L is N(M1) for some PDA M1, then L is L(M2) for some PDA M2.

Proof: Our plan now is to have M2 simulate M1 and detect when M1 empties its stack. M2 enters a final state when and only when this occurs.

Let M1 = (Q, Σ, Γ, δ, q0, Z0, φ) and let

M2 = (Q ∪ {q0', qf}, Σ, Γ ∪ {X0}, δ', q0', X0, {qf}),

where δ' is defined as follows:

1) δ'(q0', ε, X0) = {(q0, Z0X0)}.

2) For all q in Q, a in Σ ∪ {ε}, and Z in Γ,

δ'(q, a, Z) = δ(q, a, Z).

3) For all q in Q, δ'(q, ε, X0) contains (qf, ε).

Ø Rule (1) causes M2 to enter the initial ID of M1, except that M2 will have its own bottom-of-stack marker X0, which is below the symbols of M1's stack.

Ø Rule (2) allows M2 to simulate M1. Should M1 ever erase its entire stack, then M2, when simulating M1, will erase its entire stack except the symbol X0 at the bottom.

Ø Rule (3) causes M2, when the X0 appears, to enter a final state, thereby accepting the input x.
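The three rules of this construction translate into code just as directly. The Python sketch below applies them to the PDA for {w c w^R}, which accepts by empty stack; the names empty_to_final and accepts_by_final_state, the default state names and the marker "X" are assumptions of this illustration, with stack strings written top-first and "" standing for ε.

```python
from collections import deque


def empty_to_final(delta, states, q0, z0, q0_new="q0'", qf="qf", x0="X"):
    """Build δ' of M2 from M1 by rules (1) to (3).

    Returns (δ', start state, start stack symbol, set of final states).
    Assumes x0, q0_new and qf do not already occur in M1.
    """
    new = {key: set(moves) for key, moves in delta.items()}   # rule (2)
    new[(q0_new, "", x0)] = {(q0, z0 + x0)}                   # rule (1)
    for q in states:
        new.setdefault((q, "", x0), set()).add((qf, ""))      # rule (3)
    return new, q0_new, x0, {qf}


def accepts_by_final_state(delta, q0, z0, finals, word, limit=10_000):
    """Breadth-first search over IDs; True if some (p, ε, γ) with p final is reached."""
    start = (q0, word, z0)
    seen, queue = {start}, deque([start])
    while queue and limit > len(seen):
        q, rest, stack = queue.popleft()
        if rest == "" and q in finals:
            return True
        if not stack:
            continue
        choices = [("", rest)] + ([(rest[0], rest[1:])] if rest else [])
        for sym, remaining in choices:
            for p, beta in delta.get((q, sym, stack[0]), ()):
                nxt = (p, remaining, beta + stack[1:])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


# M1: the PDA for {w c w^R}, accepting by empty stack.
M1 = {("q1", x, t): {("q1", X + t)}
      for x, X in (("a", "A"), ("b", "B")) for t in "ZAB"}
M1.update({("q1", "c", t): {("q2", t)} for t in "ZAB"})
M1.update({("q2", "a", "A"): {("q2", "")},
           ("q2", "b", "B"): {("q2", "")},
           ("q2", "", "Z"): {("q2", "")}})

D2, START2, X0, FINALS = empty_to_final(M1, ["q1", "q2"], "q1", "Z")

if __name__ == "__main__":
    for w in ["abcba", "c", "abc"]:
        print(w, accepts_by_final_state(D2, START2, X0, FINALS, w))
```

The marker X appears on top of the stack exactly when the simulated M1 has emptied its own stack, and only then can the ε-move to qf be taken.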

EQUIVALENCE OF PDA'S AND CFL'S

Theorem: If L is a context-free language, then there exists a PDA M such that L = N(M).

Proof: Let G = (V, T, P, S) be a context-free grammar in Greibach normal form generating L. We assume that ε is not in L(G); the reader may modify the construction for the case where ε is in L(G).

Let M = ({q}, T, V, δ, q, S, φ),

where

δ(q, a, A) contains (q, γ) whenever A → aγ is in P.
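The one-state PDA can be built from the productions in a single loop. In the Python sketch below, a production A → aγ is represented as a triple (A, a, γ) with γ a string of single-character variables; this encoding, the function names and the example grammar (a Greibach normal form grammar for {w c w^R}: S → aSA | bSB | c, A → a, B → b) are assumptions made for illustration.

```python
from collections import deque


def gnf_to_pda(productions):
    """δ(q, a, A) contains (q, γ) whenever A → aγ is a production."""
    delta = {}
    for head, terminal, tail in productions:
        delta.setdefault(("q", terminal, head), set()).add(("q", tail))
    return delta


def in_N(delta, q0, z0, word, limit=10_000):
    """Breadth-first search over IDs; True if (q0, word, z0) |--* (p, ε, ε)."""
    start = (q0, word, z0)
    seen, queue = {start}, deque([start])
    while queue and limit > len(seen):
        q, rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True
        if not stack:
            continue
        choices = [("", rest)] + ([(rest[0], rest[1:])] if rest else [])
        for sym, remaining in choices:
            for p, beta in delta.get((q, sym, stack[0]), ()):
                nxt = (p, remaining, beta + stack[1:])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


# Assumed GNF grammar for {w c w^R}: S → aSA | bSB | c, A → a, B → b.
GNF = [("S", "a", "SA"), ("S", "b", "SB"), ("S", "c", ""),
       ("A", "a", ""), ("B", "b", "")]
DELTA = gnf_to_pda(GNF)

if __name__ == "__main__":
    for w in ["abcba", "c", "abcab"]:
        print(w, in_N(DELTA, "q", "S", w))
```

Each move of the PDA corresponds to one step of a leftmost derivation: the terminal a is read from the input and the variable on top of the stack is replaced by the variables of the production body.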

Ø The PDA M simulates leftmost derivations of G. Since G is in Greibach normal form, each sentential form in a leftmost derivation consists of a string of terminals x followed by a string of variables α.

Ø M stores the suffix α of the left sentential form on its stack after processing the prefix x. Formally we show that

S ⇒* xα by a leftmost derivation if and only if (q, x, S) |--* (q, ε, α).     (5.1)

Ø First suppose that (q, x, S) |-- ... |-- (q, ε, α) in i moves; we show by induction on i that S ⇒* xα. The basis, i = 0, is trivial, since x = ε and α = S. For the induction, suppose i ≥ 1, and let x = ya. Consider the next-to-last step:

(q, ya, S) |--* (q, a, β) |-- (q, ε, α), where the first part takes i - 1 moves.

Ø If we remove a from the end of the input string in the first i ID's of this sequence, we discover that (q, y, S) |--* (q, ε, β) in i - 1 moves, since a can have no effect on the moves of M until it is actually consumed from the input. By the inductive hypothesis S ⇒* yβ. The move (q, a, β) |-- (q, ε, α) implies that β = Aγ for some A in V, that A → aη is a production of G, and that α = ηγ.

Hence S ⇒* yβ ⇒ yaηγ = xα, and the "if" portion of (5.1) follows.

Now suppose that S ⇒* xα by a leftmost derivation of i steps. We show by induction on i that (q, x, S) |--* (q, ε, α). The basis, i = 0, is again trivial. Let i ≥ 1 and suppose

S ⇒* yAγ ⇒ yaηγ, where the first part takes i - 1 steps,

and where x = ya and α = ηγ. By the inductive hypothesis (q, y, S) |--* (q, ε, Aγ) and thus (q, ya, S) |--* (q, a, Aγ). Since A → aη is a production, it follows that δ(q, a, A) contains (q, η). Thus

(q, x, S) |--* (q, a, Aγ) |-- (q, ε, α), and the "only if" portion of (5.1) follows.

Ø To conclude the proof, we have only to note that (5.1) with α = ε says S ⇒* x if and only if (q, x, S) |--* (q, ε, ε).

That is, x is in L(G) if and only if x is in N(M).

Theorem: If L is N(M) for some PDA M, then L is a context-free language.

Proof: Let M be the PDA (Q, Σ, Γ, δ, q0, Z0, φ). Let G = (V, Σ, P, S) be a context-free grammar where V is the set of objects of the form [q, A, p], with q and p in Q and A in Γ, plus the new symbol S. P is the set of productions

1) S → [q0, Z0, q] for each q in Q;

2) [q, A, qm+1] → a [q1, B1, q2] [q2, B2, q3] ... [qm, Bm, qm+1]

for each q, q1, q2, ..., qm+1 in Q, each a in Σ ∪ {ε}, and A, B1, B2, ..., Bm in Γ,

such that δ(q, a, A) contains (q1, B1B2...Bm).

(If m = 0, then the production is [q, A, q1] → a.)
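The productions of G can be enumerated mechanically from the transition table. The Python sketch below does this for the PDA for {w c w^R}; the function name pda_to_cfg and the encoding are assumptions of this illustration: a variable [q, A, p] is a tuple (q, A, p), a production is a pair (head, body) with the body a list, stack strings are written top-first, and "" stands for ε.

```python
from itertools import product


def pda_to_cfg(delta, states, q0, z0):
    """List the productions of G for a PDA that accepts by empty stack."""
    prods = []
    for q in states:
        prods.append(("S", [(q0, z0, q)]))             # S → [q0, Z0, q]
    for (q, a, top), moves in delta.items():
        prefix = [a] if a else []                      # a may be ε
        for q1, pushed in moves:
            m = len(pushed)
            if m == 0:
                prods.append(((q, top, q1), prefix))   # [q, A, q1] → a
                continue
            # choose q2, ..., q_{m+1} in every possible way
            for rest in product(states, repeat=m):
                chain = (q1,) + rest
                body = prefix + [(chain[i], pushed[i], chain[i + 1])
                                 for i in range(m)]
                prods.append(((q, top, chain[-1]), body))
    return prods


# The PDA M for {w c w^R}.
M = {("q1", x, t): {("q1", X + t)}
     for x, X in (("a", "A"), ("b", "B")) for t in "ZAB"}
M.update({("q1", "c", t): {("q2", t)} for t in "ZAB"})
M.update({("q2", "a", "A"): {("q2", "")},
          ("q2", "b", "B"): {("q2", "")},
          ("q2", "", "Z"): {("q2", "")}})

PRODS = pda_to_cfg(M, ["q1", "q2"], "q1", "Z")

if __name__ == "__main__":
    for head, body in PRODS:
        print(head, "->", body if body else "ε")
```

Many of the generated variables derive no terminal string (for example any [q2, X, q1], since M never returns from q2 to q1); such useless symbols can be removed afterwards without changing the language.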

Ø To understand the proof it helps to know that the variables and productions of G have been defined in such a way that a leftmost derivation in G of a sentence x is a simulation of the PDA M when fed the input x.

Ø In particular, the variables that appear in any step of a leftmost derivation in G correspond to the symbols on the stack of M at a time when M has seen as much of the input as the grammar has already generated.

Ø Put another way, the intention is that [q, A, p] derives x if and only if x causes M to erase an A from its stack by some sequence of moves beginning in state q and ending in state p.

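Ø To show that L(G) = N(M), we prove by induction on the number of steps in a derivation of G, or the number of moves of M, that

[q, A, p] ⇒* x if and only if (q, x, A) |--* (p, ε, ε).     (5.3)

Ø First we show by induction on i that if (q, x, A) |--* (p, ε, ε) in i moves, then [q, A, p] ⇒* x. If i = 1, then δ(q, x, A) must contain (p, ε), where x is ε or a single input symbol, so [q, A, p] → x is a production of G. Now let i > 1, let x = ay, and let the first move be (q, ay, A) |-- (q1, y, B1B2...Bn), followed by i - 1 moves leading to (p, ε, ε).

Ø The string y can be written y = y1y2...yn, where yj has the effect of popping Bj from the stack, possibly after a long sequence of moves.

Ø That is, let y1 be the prefix of y at the end of which the stack first becomes as short as n - 1 symbols.

Ø Let y2 be the symbols of y following y1 such that at the end of y2 the stack first becomes as short as n - 2 symbols, and so on.

Ø Note that B1 need not be the nth stack symbol from the bottom during the entire time y1 is being read by M, since B1 may be changed if it is at the top of the stack and is replaced by one or more symbols.

Ø However, none of B2, B3, ..., Bn are ever at the top while y1 is being read, and so they cannot be changed or influence the computation.

In general, Bj remains on the stack unchanged while y1y2...yj-1 is read.

Ø There exist states q2, q3, ..., qn+1, where qn+1 = p, such that (qj, yj, Bj) |--* (qj+1, ε, ε) by fewer than i moves (qj is the state entered when the stack first becomes as short as n - j + 1 symbols). Thus the inductive hypothesis applies and [qj, Bj, qj+1] ⇒* yj for 1 ≤ j ≤ n. From the first move we know that [q, A, p] → a [q1, B1, q2] [q2, B2, q3] ... [qn, Bn, qn+1] is a production, so [q, A, p] ⇒* a y1y2...yn = x.

Ø Now suppose [q, A, p] ⇒* x by a derivation of i steps. We show by induction on i that (q, x, A) |--* (p, ε, ε). The basis, i = 1, is immediate, since [q, A, p] → x must be a production of G and therefore δ(q, x, A) must contain (p, ε). Note that x is ε or in Σ here.

For the induction, suppose

[q, A, p] ⇒ a [q1, B1, q2] ... [qn, Bn, qn+1] ⇒* x, where the second part takes i - 1 steps,

and where qn+1 = p. Then we may write x = a x1x2...xn, where [qj, Bj, qj+1] ⇒* xj for 1 ≤ j ≤ n, with each derivation taking fewer than i steps.

Ø By the inductive hypothesis, (qj, xj, Bj) |--* (qj+1, ε, ε) for 1 ≤ j ≤ n. If we insert Bj+1...Bn at the bottom of each stack in the above sequence of ID's, we see that

(qj, xj, BjBj+1...Bn) |--* (qj+1, ε, Bj+1...Bn).     (5.4)

From the first step in the derivation of x from [q, A, p] we know that (q, x, A) |-- (q1, x1x2...xn, B1B2...Bn) is a legal move of M, so from this move and (5.4) for j = 1, 2, ..., n, (q, x, A) |--* (p, ε, ε) follows.

Ø The proof concludes with the observation that (5.3) with q = q0 and A = Z0 says

[q0, Z0, p] ⇒* x if and only if (q0, x, Z0) |--* (p, ε, ε).

This observation, together with rule (1) of the construction of G, says that S ⇒* x if and only if (q0, x, Z0) |--* (p, ε, ε) for some state p.

That is, x is in L(G) if and only if x is in N(M).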

3.7 PUMPING LEMMA

Ø We have seen the pumping lemma for regular languages. It states that every sufficiently long string in a regular language contains a substring that can be pumped,

i.e. the substring can be repeated any number of times and the resulting string still lies in the regular language.

Ø The pumping lemma for CFLs states that every sufficiently long string of a CFL contains two short substrings, close together, that can be repeated the same number of times, and the resulting string still lies in the CFL.

The formal statement of the pumping lemma is as follows:

Let L be any CFL. Then there is a constant n, depending only on L, such that if z is in L and |z| ≥ n, then we may write z = uvwxy such that

i) |vx| ≥ 1,

ii) |vwx| ≤ n, and

iii) u v^i w x^i y is in L for all i ≥ 0.
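The conditions of the lemma can be explored by brute force on small strings. The Python sketch below enumerates every split z = uvwxy with |vx| ≥ 1 and |vwx| ≤ n and tests a few exponents i. The function names and the choice of exponents are assumptions of this illustration. Finding no split for some z refutes condition iii) for that z and n, whereas finding a split for a handful of exponents is only evidence and not a proof that the string can be pumped for all i.

```python
def find_pumpable_split(z, n, member, exponents=(0, 2, 3)):
    """Return some (u, v, w, x, y) with |vx| >= 1 and |vwx| <= n such that
    u v^e w x^e y satisfies `member` for every e in `exponents`, else None."""
    length = len(z)
    for i in range(length + 1):                          # vwx starts at i
        for l in range(i + 1, min(i + n, length) + 1):   # vwx ends at l
            for j in range(i, l + 1):                    # v = z[i:j]
                for k in range(j, l + 1):                # x = z[k:l]
                    if (j - i) + (l - k) == 0:
                        continue                         # need |vx| >= 1
                    u, v, w, x, y = z[:i], z[i:j], z[j:k], z[k:l], z[l:]
                    if all(member(u + v * e + w + x * e + y) for e in exponents):
                        return (u, v, w, x, y)
    return None


def is_anbncn(s):
    """Membership in {a^n b^n c^n | n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def is_anbn(s):
    """Membership in {a^n b^n | n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


if __name__ == "__main__":
    print(find_pumpable_split("aaaabbbbcccc", 4, is_anbncn))
    print(find_pumpable_split("aaaabbbb", 4, is_anbn))
```

For z = a^4 b^4 c^4 and n = 4 no split survives, because a window vwx of length at most 4 touches at most two of the three letters. For a^4 b^4 a split with v inside the a's and x inside the b's is found.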

Proof: Let G be a CFG in CNF generating L - {ε}.

First we show that if the parse tree for a word generated by a CNF grammar has no path of length greater than i, then the word is of length no greater than 2^(i-1). We use induction on i to prove this.

The case i = 1 is trivial. We know that every production in a CNF grammar is of the form A -> a or A -> BC.

So the tree for i = 1 must be of the form

S
|
a

i.e. for i = 1 the length of the word generated is at most 1 = 2^0, so the claim is true for i = 1. Assume that the claim is true for i - 1, i.e. a tree with no path of length greater than i - 1 generates no word of length greater than 2^(i-2).

For the induction step consider a tree whose root production is S -> AB:

    S
   / \
  A   B
  T1  T2

where T1 and T2 are the subtrees rooted at A and B. If the tree has no path of length greater than i, then T1 and T2 have no paths of length greater than i - 1, so by the assumption the words generated by T1 and T2 are each no longer than 2^(i-2), and together no longer than 2 * 2^(i-2) = 2^(i-1).

Hence a word generated by a parse tree with no path of length greater than i is no longer than 2^(i-1).

Let the grammar have k non-terminal symbols and let n = 2^k. If z is in L(G) and |z| ≥ n, then since |z| > 2^(k-1), any parse tree for z must contain a path of length at least k + 1. Such a path has at least k + 2 nodes, all but the last of which are labelled by non-terminals.

But the grammar has only k non-terminals, so at least one non-terminal must be repeated on the path. Let P be a path that is as long as or longer than any other path in the tree. Then there must be two nodes n1 and n2 along P such that

i) The nodes n1 and n2 both have the same label, say A.

ii) The node n1 is closer to the root than n2.

iii) The portion of the path from n1 to the leaf is of length at most k + 1.

(To find them, walk up P from the leaf: among the first k + 1 non-terminal nodes met, some label must occur twice.)

The subtree T1 with root n1 represents the derivation of a subword of length at most 2^k. This is because no path in T1 can be of length greater than k + 1, since P was a longest path in the whole tree.

Let z1 be the yield of the subtree T1, let T2 be the subtree with root n2, and let z2 be the yield of T2.

Now we can write z1 as z3 z2 z4. We can see that z3 and z4 cannot both be ε, since the first production used in the derivation of z1 must be of the form A -> BC,

and the subtree T2 must lie entirely within the subtree generated by B or within the subtree generated by C.

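Now we know that

A ⇒* z3 A z4 and A ⇒* z2, and

|z3 z2 z4| ≤ 2^k = n.

We can see that

A ⇒* z3 A z4 ⇒* z3 z3 A z4 z4 ⇒* z3 z3 z3 A z4 z4 z4 ⇒* ...

so A ⇒* z3^i z2 z4^i for every i ≥ 0.

Now z3 z2 z4 is a substring of z, so

z = u z3 z2 z4 y for some strings u and y, and if we let

w = z2, v = z3 and x = z4, then

uvwxy is in L(G) and u v^i w x^i y is in L(G) for all i ≥ 0,

where |uvwxy| ≥ n, |vwx| ≤ n and |vx| ≥ 1.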

PART A

1. Define PDA.
2. Write the components of a PDA.
3. Write the formal representation of a PDA.
4. Give the diagrammatic representation of a PDA.
5. What are the three ways to recognize PDA?
6. Give an informal representation of a PDA for the language L = {0^n 1^n | n >= 0}.
7. Give the mathematical model of a PDA for the language L = {0^n 1^m | n >= 0, m >= 0, m != n}.
8. Write the instantaneous representation of PDA.
9. What is the relation between PDAs and CFLs?
10. What is the relation between NPDA and DPDA?
11. Write the closure properties of CFLs.
12. Define the pumping lemma for regular languages.
13. Show that the language L = {a^n b^n c^n : n >= 0} is not context free.
14. Construct a PDA that accepts the language generated by the grammar S → aSbb | aab.

PART-B

1. Write the mathematical model of the language L = {0^n 1^n | n >= 0}.
2. Write the mathematical model of the language L = w c w^t.
3. Construct a PDA for L = {a^n b^2n : n >= 1}.
4. Construct a PDA for L = {a^n b^3n : n >= 1}.
5. Construct a PDA equivalent to the grammar S → aAA, A → aS | bS | a.
6. Construct a PDA equivalent to the grammar S → aAA, A → SA | b.
7. Construct a PDA accepting the following language: L = {a^n b^2n : n >= 1}.
8. Construct a PDA accepting {a^n b^m a^n : n >= 1, m >= 1, m != n} by empty store and reaching the final state.
9. Construct a PDA accepting the language generated by the grammar G = ({S, A}, {a, b}, S, P) with the productions S → AA | a, A → SA | b.
