Properties of Context-Free Languages
BİL405 - Automata Theory and Formal Languages
• Simplification of CFG's. This makes life easier, since we can claim that if a language is context-free, then it has a grammar of a special form.
• Pumping Lemma for CFL's. Similar to the regular case.
• Closure properties. Some, but not all, of the closure properties of regular languages carry over to CFL's.
• Decision properties. We can test for membership and emptiness, but for instance, equivalence of CFL's is undecidable.
Chomsky Normal Form
We want to show that every CFL (without ε) is generated by a CFG where all productions are of the form
A → BC, or A → a,
where A, B, and C are variables, and a is a terminal.
This is called Chomsky Normal Form (CNF), and in order to get there we first have to eliminate useless symbols, ε-productions, and unit productions.
Eliminating Useless Symbols
A symbol X is useful for a grammar G = (V, T, P, S) if there is a derivation
S ⇒* αXβ ⇒* w
for a terminal string w. Symbols that are not useful are called useless.
A symbol X is generating if X ⇒* w for some terminal string w, and reachable if S ⇒* αXβ for some α and β.
It turns out that if we eliminate non-generating symbols first, and then non-reachable ones, we will be left with only useful symbols.
Eliminating Useless Symbols - Example
Eliminating Useless Symbols
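We have to give algorithms to compute the generating and reachable symbols of G = (V, T, P, S). The generating symbols g(G) are computed by the following closure algorithm:
Basis: g(G) contains every terminal in T.
Induction: if A → α is in P and every symbol of α is in g(G), then A is in g(G).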
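The closure computation for generating symbols can be sketched in Python. This is a minimal illustration, not the slides' own code: the representation of productions as (head, body) pairs and the example grammar S → AB | a, A → b are assumptions made for the sketch.

```python
def generating_symbols(terminals, productions):
    """Return g(G). productions is a list of (head, body) pairs,
    where body is a tuple of symbols (assumed representation)."""
    generating = set(terminals)      # basis: every terminal is generating
    changed = True
    while changed:                   # repeat until nothing new is added
        changed = False
        for head, body in productions:
            if head not in generating and all(s in generating for s in body):
                generating.add(head)
                changed = True
    return generating


# Hypothetical grammar: S -> AB | a, A -> b (B has no productions).
P = [("S", ("A", "B")), ("S", ("a",)), ("A", ("b",))]
g = generating_symbols({"a", "b"}, P)
```

Here B is never added, so any production that mentions B can be dropped before reachability is computed.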
Eliminating Useless Symbols
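The set of reachable symbols r(G) of G = (V, T, P, S) is computed by the following closure algorithm:
Basis: r(G) contains the start symbol S.
Induction: if A is in r(G) and A → α is in P, then every symbol of α is in r(G).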
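A worklist version of the reachability closure might look as follows. It is a sketch under assumptions: productions are (head, body) pairs, and the grammar S → a, A → b is a hypothetical example in which A cannot be reached from S.

```python
def reachable_symbols(start, productions):
    """Return r(G). productions is a list of (head, body) pairs (assumed representation)."""
    reachable = {start}              # basis: the start symbol is reachable
    worklist = [start]
    while worklist:
        current = worklist.pop()
        for head, body in productions:
            if head == current:
                for symbol in body:  # induction: symbols in bodies of reachable heads
                    if symbol not in reachable:
                        reachable.add(symbol)
                        worklist.append(symbol)
    return reachable


# Hypothetical grammar: S -> a, A -> b.
P = [("S", ("a",)), ("A", ("b",))]
r = reachable_symbols("S", P)
```

Only S and a are reachable here, so A and b (and the production A → b) are useless.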
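Eliminating ε-Productions
We'll compute n(G), the set of nullable symbols of a grammar G = (V, T, P, S), where a variable A is nullable if A ⇒* ε:
Basis: n(G) contains every A with A → ε in P.
Induction: if A → C_1 C_2 ... C_k is in P and every C_i is in n(G), then A is in n(G).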
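The nullable-symbol computation, and the ε-production removal built on it, can be sketched as below. The encoding (an ε-body is the empty tuple) and the grammar S → AB, A → aAA | ε, B → bBB | ε are assumptions chosen for illustration. Note that the resulting grammar generates the language without ε.

```python
from itertools import product


def nullable_symbols(productions):
    """Return n(G); an empty body () encodes an ε-production (assumed encoding)."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # all() over an empty body is True, which covers the basis A -> ε
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(productions):
    """For each body, keep or drop every nullable symbol; discard empty bodies."""
    nullable = nullable_symbols(productions)
    result = set()
    for head, body in productions:
        options = [((s,), ()) if s in nullable else ((s,),) for s in body]
        for choice in product(*options):
            new_body = tuple(sym for part in choice for sym in part)
            if new_body:
                result.add((head, new_body))
    return result


# Hypothetical grammar: S -> AB, A -> aAA | ε, B -> bBB | ε.
P = [("S", ("A", "B")), ("A", ("a", "A", "A")), ("A", ()),
     ("B", ("b", "B", "B")), ("B", ())]
n = nullable_symbols(P)
P_no_eps = eliminate_epsilon(P)
```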
Eliminating ε-Productions - Example
Eliminating Unit Productions
A → B is a unit production whenever A and B are variables. Unit productions can be eliminated. Let's look at a grammar that has the unit productions E → T, T → F, and F → I.
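One common way to remove unit productions is to compute all unit pairs (A, B) with A ⇒* B using unit productions only, then copy B's non-unit productions to A. The sketch below does this. The grammar used is an assumption: the usual expression grammar with variables E, T, F, I, which has exactly the unit productions E → T, T → F, and F → I; the grammar on the original slide is not reproduced here.

```python
def eliminate_unit_productions(variables, productions):
    """Return an equivalent production set without unit productions."""
    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    # Unit pairs: (A, B) such that A =>* B using unit productions only.
    pairs = {(A, A) for A in variables}
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if is_unit(body):
                for A, B in list(pairs):
                    if B == head and (A, body[0]) not in pairs:
                        pairs.add((A, body[0]))
                        changed = True

    # For each pair (A, B), give A every non-unit production of B.
    return {(A, body) for A, B in pairs
            for head, body in productions
            if head == B and not is_unit(body)}


# Assumed example grammar (expression grammar).
V = {"E", "T", "F", "I"}
P = [("E", ("T",)), ("E", ("E", "+", "T")),
     ("T", ("F",)), ("T", ("T", "*", "F")),
     ("F", ("I",)), ("F", ("(", "E", ")")),
     ("I", ("a",)), ("I", ("b",)), ("I", ("I", "a")),
     ("I", ("I", "b")), ("I", ("I", "0")), ("I", ("I", "1"))]
P_no_unit = eliminate_unit_productions(V, P)
```

In this example E ends up with nine productions: its own E + T, plus everything T, F, and I can produce directly.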
Eliminating Unit Productions - Example
The resulting grammar is equivalent to the original one.
Summary – Cleanup Grammar
Chomsky Normal Form, CNF
We shall show that every nonempty CFL without ε has a grammar G without useless symbols, and such that every production is of the form
• A → BC, where A, B, and C are variables, or
• A → a, where A is a variable and a is a terminal.
To achieve this, start with any grammar for the CFL, and
1. "Clean up" the grammar.
2. Arrange that all bodies of length 2 or more consist only of variables.
3. Break bodies of length 3 or more into a cascade of two-variable-bodied productions.
Chomsky Normal Form, CNF
• For step 2, for every terminal a that appears in a body of length ≥ 2, create a new variable, say A, and replace a by A in all bodies of length ≥ 2. Then add a new rule A → a.
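Steps 2 and 3 of the conversion can be sketched as follows, assuming the grammar has already been cleaned up. The fresh variable names (`T_a`, `C1`, ...) are hypothetical and assumed not to clash with existing variables; the grammar S → aSb | ab is an illustrative example only.

```python
def to_cnf(productions, terminals):
    """Apply CNF steps 2 and 3 to a cleaned-up grammar (assumed precondition)."""
    # Step 2: in bodies of length >= 2, replace each terminal a by a variable T_a.
    term_var = {}
    step2 = []
    for head, body in productions:
        if len(body) >= 2:
            body = tuple(term_var.setdefault(s, f"T_{s}") if s in terminals else s
                         for s in body)
        step2.append((head, body))
    step2 += [(var, (a,)) for a, var in term_var.items()]   # rules T_a -> a

    # Step 3: break bodies of length >= 3 into a cascade of two-variable bodies.
    result, counter = [], 0
    for head, body in step2:
        while len(body) > 2:
            counter += 1
            fresh = f"C{counter}"
            result.append((head, (body[0], fresh)))
            head, body = fresh, body[1:]
        result.append((head, body))
    return result


# Hypothetical grammar: S -> aSb | ab.
cnf = to_cnf([("S", ("a", "S", "b")), ("S", ("a", "b"))], {"a", "b"})
```

For this input the body aSb first becomes T_a S T_b, which is then split into S → T_a C1 and C1 → S T_b.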
Example of CNF conversion
Let's start with the grammar (step 1 already done)
The Pumping Lemma for Context-Free Languages
• The "pumping lemma for context-free languages" says that in any sufficiently long string in a CFL, it is possible to find at most two short, nearby substrings, that we can "pump" in tandem.
• That is, we may repeat both of the strings i times, for any integer i, and the resulting string will still be in the language.
• Remember the "pumping lemma for regular languages" says that we can always find one small string to pump.
The Pumping Lemma for CFLs – The Size of Parse Trees
• Our first step in deriving a pumping lemma for CFL's is to examine the shape and size of parse trees.
– One of the uses of CNF is to turn parse trees into binary trees.
– These trees have some convenient properties, one of which we exploit here.
Theorem 7.17: Suppose we have a parse tree according to a Chomsky-Normal-Form grammar G = (V, T, P, S), and suppose that the yield of the tree is a terminal string w. If the length of the longest path is n, then |w| ≤ 2^(n-1).
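The bound in Theorem 7.17 can be sanity-checked on a small tree. The sketch below assumes a hypothetical encoding: a CNF parse tree is a nested tuple (label, child, ...), a terminal leaf is a plain string, and path length is counted in edges, including the edge down to the terminal.

```python
def longest_path(tree):
    """Number of edges on the longest root-to-leaf path."""
    if isinstance(tree, str):        # terminal leaf
        return 0
    return 1 + max(longest_path(child) for child in tree[1:])


def yield_length(tree):
    """Number of terminals in the yield of the tree."""
    if isinstance(tree, str):
        return 1
    return sum(yield_length(child) for child in tree[1:])


# Hypothetical CNF parse tree with yield "aab".
t = ("S", ("A", "a"), ("S", ("A", "a"), ("B", "b")))
n = longest_path(t)
w_len = yield_length(t)
bound_holds = w_len <= 2 ** (n - 1)
```

This checks a single instance and is no substitute for the inductive proof of the theorem.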
The Pumping Lemma for CFLs
Theorem 7.18: (The pumping lemma for context-free languages) Let L be a CFL. Then there exists a constant n such that if z is any string in L such that |z| is at least n, then we can write z = uvwxy, subject to the following conditions:
1. |vwx| ≤ n. That is, the middle portion is not too long.
2. vx ≠ ε. Since v and x are the pieces to be "pumped," this condition says that at least one of the strings we pump must not be empty.
3. For all i ≥ 0, uv^i wx^i y is in L. That is, the two strings v and x may be "pumped" any number of times, including 0, and the resulting string will still be a member of L.
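To make condition 3 concrete, the sketch below pumps one decomposition of a string in the context-free language {0^k 1^k | k ≥ 0}. The language, the string, and the chosen split are illustrative assumptions, not taken from the slides.

```python
def pump(u, v, w, x, y, i):
    """Return u v^i w x^i y."""
    return u + v * i + w + x * i + y


def in_language(s):
    """Membership test for the assumed example language {0^k 1^k | k >= 0}."""
    k = len(s) // 2
    return s == "0" * k + "1" * k


# z = 000111 split as u=00, v=0, w=ε, x=1, y=11.
u, v, w, x, y = "00", "0", "", "1", "11"
pumped = [pump(u, v, w, x, y, i) for i in range(4)]
```

Because v and x are pumped in tandem, the number of 0's and 1's stays equal for every i.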
The Pumping Lemma for CFLs - Proof
PROOF:
• Our first step is to find a Chomsky-Normal-Form grammar G for L.
– Technically, we cannot find such a grammar if L is the CFL ∅ or {ε}.
– However, if L = ∅ then the statement of the theorem, which talks about a string z in L, surely cannot be violated, since there is no such z in ∅.
– Also, the CNF grammar G will actually generate L − {ε}, but that is again not of importance, since we shall surely pick n > 0, in which case z cannot be ε anyway.
The Pumping Lemma for CFLs - Proof
• Now, starting with a CNF grammar G = (V, T, P, S) such that L(G) = L − {ε}, let G have m variables.
• Choose n = 2^m.
• Next, suppose that z in L is of length at least n.
– By Theorem 7.17, any parse tree whose longest path is of length m or less must have a yield of length 2^(m-1) = n/2 or less.
– Such a parse tree cannot have yield z, because z is too long.
– Thus, any parse tree with yield z has a path of length at least m + 1.
The Pumping Lemma for CFLs - Proof
• The figure suggests the longest path in the tree for z, where k is at least m and the path is of length k + 1.
• Since k ≥ m, there are at least m + 1 occurrences of variables A_0, A_1, ..., A_k on the path.
• As there are only m different variables in V, at least two of the last m + 1 variables on the path (that is, A_(k-m) through A_k, inclusive) must be the same variable.
• Suppose A_i = A_j, where k − m ≤ i < j ≤ k.
The Pumping Lemma for CFLs - Proof
• Then it is possible to divide the tree as shown in the figure.
• String w is the yield of the subtree rooted at A_j.
• Strings v and x are the strings to the left and right, respectively, of w in the yield of the larger subtree rooted at A_i.
• Note that, since there are no unit productions, v and x could not both be ε, although one could be.
• Finally, u and y are those portions of z that are to the left and right, respectively, of the subtree rooted at A_i.
The Pumping Lemma for CFLs – Proof
• If A_i = A_j = A, then we can construct new parse trees from the original tree, as suggested in (a).
• First, we may replace the subtree rooted at A_i, which has yield vwx, by the subtree rooted at A_j, which has yield w.
• The reason we can do so is that both of these trees have root labeled A.
• The resulting tree is suggested in (b); it has yield uwy and corresponds to the case i = 0 in the pattern of strings uv^i wx^i y.
• Another option is suggested by (c). There, we have replaced the subtree rooted at A_j by the entire subtree rooted at A_i.
• Again, the justification is that we are substituting one tree with root labeled A for another tree with the same root label.
The Pumping Lemma for CFLs – Proof
• The remaining detail is condition (1), which says that |vwx| ≤ n.
• However, we picked A_i to be close to the bottom of the tree; that is, k − i ≤ m.
• Thus, the longest path in the subtree rooted at A_i is no greater than m + 1.
• By Theorem 7.17, the subtree rooted at A_i has a yield whose length is no greater than 2^m = n.
Q.E.D.
The Pumping Lemma for CFLs - Example
Example: Let L be the language {0^n 1^n 2^n | n ≥ 1}. This language is not a CFL.
• Suppose L were context-free.
• Then there is an integer n given to us by the pumping lemma.
• Let us pick z = 0^n 1^n 2^n.
• z = uvwxy, where |vwx| ≤ n and v and x are not both ε.
• Then we know that vwx cannot involve both 0's and 2's, since the last 0 and the first 2 are separated by n + 1 positions.
• We shall prove that L contains some string known not to be in L, thus contradicting the assumption that L is a CFL.
The Pumping Lemma for CFLs - Example
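• The cases are as follows:
1. vwx has no 2's. Then vx consists of only 0's and 1's, and has at least one of these symbols. So uwy, which must be in L by the pumping lemma, has n 2's but fewer than n 0's or fewer than n 1's, and therefore is not in L.
2. vwx has no 0's. Similarly, uwy has n 0's but fewer than n 1's or fewer than n 2's, and therefore is not in L.
• Whichever case holds, we conclude that L has a string we know not to be in L.
• This contradiction allows us to conclude that our assumption was wrong; L is not a CFL. Q.E.D.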
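The case analysis can be cross-checked by brute force for a fixed small n: enumerate every split z = uvwxy with |vwx| ≤ n and vx ≠ ε, and confirm that pumping with i = 0 or i = 2 leaves L. The sketch below does this; it is a finite sanity check for the chosen n only, not a replacement for the proof, and the function names are hypothetical.

```python
def in_L(s):
    """Membership in {0^n 1^n 2^n | n >= 1}."""
    n = len(s) // 3
    return n >= 1 and s == "0" * n + "1" * n + "2" * n


def every_split_fails(n):
    """True if no split of z = 0^n 1^n 2^n survives pumping with i = 0 and i = 2."""
    z = "0" * n + "1" * n + "2" * n
    for start in range(len(z)):
        for total in range(1, n + 1):            # |vwx| <= n
            end = start + total
            if end > len(z):
                break
            for lv in range(total + 1):          # |v|
                for lx in range(total - lv + 1): # |x|
                    if lv + lx == 0:             # vx must not be empty
                        continue
                    u, v = z[:start], z[start:start + lv]
                    w = z[start + lv:end - lx]
                    x, y = z[end - lx:end], z[end:]
                    if all(in_L(u + v * i + w + x * i + y) for i in (0, 2)):
                        return False             # this split could be pumped
    return True


result = every_split_fails(4)
```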
Closure Properties of Context-Free Languages
Substitutions
Theorem 7.24: The context-free languages are closed under the following operations:
1. Union.
2. Concatenation.
3. Closure (*), and positive closure (+).
4. Homomorphism.
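Closure under union, concatenation, and star can also be shown directly on grammars, by adding a fresh start symbol. The sketch below uses that construction, which is a standard alternative and not necessarily the argument used on the slides. It assumes the two grammars have disjoint variables and that `S0` is unused; `strings_up_to` is a hypothetical helper that is only valid for grammars without ε-productions, so it is not applied to the star grammar.

```python
def union(p1, s1, p2, s2, new_start="S0"):
    return p1 + p2 + [(new_start, (s1,)), (new_start, (s2,))], new_start


def concatenation(p1, s1, p2, s2, new_start="S0"):
    return p1 + p2 + [(new_start, (s1, s2))], new_start


def star(p, s, new_start="S0"):
    return p + [(new_start, (s, new_start)), (new_start, ())], new_start


def strings_up_to(productions, start, variables, max_len):
    """Terminal strings of length <= max_len, by leftmost expansion.
    Assumes no ε-productions, so sentential forms never shrink."""
    seen, frontier, found = {(start,)}, [(start,)], set()
    while frontier:
        form = frontier.pop()
        idx = next((i for i, s in enumerate(form) if s in variables), None)
        if idx is None:
            found.add("".join(form))
            continue
        for head, body in productions:
            if head == form[idx]:
                new = form[:idx] + body + form[idx + 1:]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    frontier.append(new)
    return found


# Assumed example grammars: A -> 0A1 | 01 and B -> 2B | 2.
P1 = [("A", ("0", "A", "1")), ("A", ("0", "1"))]
P2 = [("B", ("2", "B")), ("B", ("2",))]
PU, SU = union(P1, "A", P2, "B")
PC, SC = concatenation(P1, "A", P2, "B")
PS, SS = star(P1, "A")
union_words = strings_up_to(PU, SU, {"A", "B", "S0"}, 4)
concat_words = strings_up_to(PC, SC, {"A", "B", "S0"}, 4)
```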
Closure Properties of Context-Free Languages
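Reversal: if L is a CFL, then so is its reversal L^R.
Intersection With a Regular Language
The CFL's are not closed under intersection in general. However, if L is a CFL and R is a regular language, then L ∩ R is a CFL.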
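A standard witness for non-closure under intersection, assumed here since the slide's own example is not reproduced, is the pair L1 = {0^n 1^n 2^i} and L2 = {0^i 1^n 2^n} with n, i ≥ 1. Both are context-free, yet their intersection is {0^n 1^n 2^n | n ≥ 1}, which the pumping lemma shows is not. The sketch below only confirms the set identity on all strings up to length 6.

```python
import re
from itertools import product

BLOCKS = re.compile(r"(0+)(1+)(2+)")


def in_L1(s):
    """0^n 1^n 2^i with n, i >= 1."""
    m = BLOCKS.fullmatch(s)
    return bool(m) and len(m.group(1)) == len(m.group(2))


def in_L2(s):
    """0^i 1^n 2^n with n, i >= 1."""
    m = BLOCKS.fullmatch(s)
    return bool(m) and len(m.group(2)) == len(m.group(3))


def in_L(s):
    """0^n 1^n 2^n with n >= 1."""
    m = BLOCKS.fullmatch(s)
    return bool(m) and len(m.group(1)) == len(m.group(2)) == len(m.group(3))


words = ["".join(t) for k in range(7) for t in product("012", repeat=k)]
identity_holds = all((in_L1(w) and in_L2(w)) == in_L(w) for w in words)
```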