Properties of Context-Free Languages BİL405 - Automata Theory and Formal Languages 1


Properties of Context-Free Languages

• Simplification of CFG's. This makes life easier, since we can claim that if a language is CF, then it has a grammar of a special form.

• Pumping Lemma for CFL's. Similar to the regular case.

• Closure properties. Some, but not all, of the closure properties of regular languages carry over to CFL's.

• Decision properties. We can test for membership and emptiness, but for instance, equivalence of CFL's is undecidable.


Chomsky Normal Form


We want to show that every CFL (without ε) is generated by a CFG where all productions are of the form

A → BC, or A → a

where A, B, and C are variables, and a is a terminal.

This is called Chomsky Normal Form (CNF), and in order to get there we have to

1. Eliminate useless symbols.

2. Eliminate ε-productions.

3. Eliminate unit productions.


Eliminating Useless Symbols


A symbol X is useful for a grammar G = (V,T,P,S) if there is a derivation

S ⇒* αXβ ⇒* w

for a terminal string w. Symbols that are not useful are called useless.

A symbol X is generating if X ⇒* w for some terminal string w, and reachable if S ⇒* αXβ for some α and β.

It turns out that if we eliminate non-generating symbols first, and then non-reachable ones, we will be left with only useful symbols.


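We have to give algorithms to compute the generating and reachable symbols of G = (V,T,P,S). The generating symbols g(G) are computed by the following closure algorithm:

Basis: every terminal is in g(G).

Induction: if X → α is in P and every symbol of α is in g(G), then X is in g(G).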
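The closure computation of g(G) can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: the dictionary representation (variable mapped to a list of bodies, each body a tuple of symbols) and the small example grammar S → AB | a, A → b are assumptions made here.

```python
def generating_symbols(terminals, productions):
    """Compute g(G) by closure: terminals are generating, and a variable
    becomes generating once one of its bodies uses only generating symbols."""
    gen = set(terminals)              # basis: every terminal is generating
    changed = True
    while changed:                    # repeat until nothing new is added
        changed = False
        for head, bodies in productions.items():
            if head in gen:
                continue
            if any(all(sym in gen for sym in body) for body in bodies):
                gen.add(head)
                changed = True
    return gen


# Hypothetical example grammar: S -> AB | a, A -> b, and B has no productions.
productions = {"S": [("A", "B"), ("a",)], "A": [("b",)], "B": []}
print(sorted(generating_symbols({"a", "b"}, productions)))
```

In this example B is the only non-generating symbol, so removing it also removes the production S → AB.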


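The set of reachable symbols r(G) of G = (V,T,P,S) is computed by the following closure algorithm:

Basis: S is in r(G).

Induction: if A is in r(G) and A → α is in P, then every symbol of α is in r(G).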
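A worklist version of the computation of r(G) might look like this in Python. The grammar representation (variable mapped to a list of tuple bodies) and the example grammar are assumptions for illustration; the example is what remains of S → AB | a, A → b after the non-generating symbol B has been removed.

```python
def reachable_symbols(start, productions):
    """Compute r(G) by closure: the start symbol is reachable, and so is
    every symbol in the body of a production whose head is reachable."""
    reach = {start}                   # basis: the start symbol
    worklist = [start]
    while worklist:
        head = worklist.pop()
        # Terminals have no productions, so .get() yields an empty list.
        for body in productions.get(head, []):
            for sym in body:
                if sym not in reach:
                    reach.add(sym)
                    worklist.append(sym)
    return reach


# Hypothetical grammar after removing non-generating symbols: S -> a, A -> b.
productions = {"S": [("a",)], "A": [("b",)]}
print(sorted(reachable_symbols("S", productions)))
```

Here A and b are unreachable, which illustrates why the non-generating symbols are removed first: A only becomes unreachable once S → AB is gone.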


Eliminating ε-Productions


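We'll compute n(G), the set of nullable symbols of a grammar G = (V,T,P,S), that is, the variables A with A ⇒* ε:

Basis: if A → ε is in P, then A is in n(G).

Induction: if A → C_1 C_2 ... C_k is in P and every C_i is in n(G), then A is in n(G).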
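The following Python sketch computes the nullable set and then removes ε-productions in the usual way: each nullable symbol in a body is either kept or dropped, and empty bodies are never emitted. The representation (the empty tuple stands for ε) and the example grammar S → AB, A → aAA | ε, B → bBB | ε are assumptions chosen for illustration.

```python
from itertools import product


def nullable_symbols(productions):
    """Compute n(G): a variable is nullable if some body has only nullable symbols."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            # An empty body () makes all(...) true, which covers the basis A -> ε.
            if head not in nullable and any(
                all(sym in nullable for sym in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(productions):
    """Return an ε-free grammar generating L(G) minus the empty string."""
    nullable = nullable_symbols(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # A nullable symbol may be kept or dropped; other symbols are kept.
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                new_body = tuple(s for part in choice for s in part)
                if new_body:          # never emit an empty body
                    new_bodies.add(new_body)
        result[head] = sorted(new_bodies)
    return result


# Hypothetical example: S -> AB, A -> aAA | ε, B -> bBB | ε
productions = {
    "S": [("A", "B")],
    "A": [("a", "A", "A"), ()],
    "B": [("b", "B", "B"), ()],
}
nullable = nullable_symbols(productions)
cleaned = eliminate_epsilon(productions)
print(sorted(nullable))
print(cleaned)
```

For this grammar all three variables are nullable, and the result contains S → AB | A | B, A → aAA | aA | a, and B → bBB | bB | b.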


Eliminating Unit Productions


A → B is a unit production whenever A and B are variables. Unit productions can be eliminated. Consider a grammar whose unit productions are E → T, T → F, and F → I.
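One common way to eliminate unit productions is to compute all unit pairs (A, B) with A ⇒* B using only unit productions, and then give A every non-unit body of B. The sketch below assumes that the grammar meant here is the familiar expression grammar over E, T, F, and I; that choice, like the dictionary representation, is an assumption, since the grammar itself is not shown above.

```python
def unit_pairs(productions):
    """All pairs (A, B) such that A derives B using unit productions only."""
    variables = set(productions)
    pairs = {(a, a) for a in variables}       # basis: (A, A) for every variable
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for body in productions[b]:
                if len(body) == 1 and body[0] in variables:
                    if (a, body[0]) not in pairs:
                        pairs.add((a, body[0]))
                        changed = True
    return pairs


def eliminate_unit(productions):
    """For each unit pair (A, B), add A -> α for every non-unit body α of B."""
    variables = set(productions)
    result = {a: [] for a in productions}
    for a, b in sorted(unit_pairs(productions)):
        for body in productions[b]:
            is_unit = len(body) == 1 and body[0] in variables
            if not is_unit and body not in result[a]:
                result[a].append(body)
    return result


# Assumed example: the usual expression grammar with unit productions
# E -> T, T -> F, and F -> I.
productions = {
    "E": [("T",), ("E", "+", "T")],
    "T": [("F",), ("T", "*", "F")],
    "F": [("I",), ("(", "E", ")")],
    "I": [("a",), ("b",), ("I", "a"), ("I", "b"), ("I", "0"), ("I", "1")],
}
pairs = unit_pairs(productions)
no_units = eliminate_unit(productions)
print(len(pairs), {head: len(bodies) for head, bodies in no_units.items()})
```

After the transformation E inherits the non-unit bodies of T, F, and I, so no body consists of a single variable any more.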


Eliminating Unit Productions - Example


The resulting grammar is equivalent to the original one.


Summary – Cleanup Grammar

To clean up a grammar, eliminate ε-productions first, then unit productions, and finally useless symbols. The order matters: eliminating ε-productions can create unit productions, and eliminating unit productions can leave useless symbols behind.


Chomsky Normal Form, CNF


We shall show that every nonempty CFL without ε has a grammar G without useless symbols, and such that every production is of the form

A → BC, or A → a

where A, B, and C are variables, and a is a terminal.

To achieve this, start with any grammar for the CFL, and

1. "Clean up" the grammar.

2. Arrange that all bodies of length 2 or more consist only of variables.

3. Break bodies of length 3 or more into a cascade of two-variable-bodied productions.


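• For step 2, for every terminal a that appears in a body of length ≥ 2, create a new variable, say A, and replace a by A in all bodies of length ≥ 2. Then add a new rule A → a.

• For step 3, for each rule of the form A → B_1 B_2 ... B_k with k ≥ 3, introduce new variables C_1, C_2, ..., C_{k-2} and replace the rule with A → B_1 C_1, C_1 → B_2 C_2, ..., C_{k-2} → B_{k-1} B_k.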
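Steps 2 and 3 can be sketched in Python as below. The sketch assumes the grammar has already been cleaned up, so every body of length 1 is a single terminal and there are no empty bodies. The fresh-variable names such as `<a>` and `<C1>` are a hypothetical naming scheme assumed not to clash with existing symbols, and the example grammar S → aSb | ab is chosen only for illustration.

```python
def to_cnf(productions, terminals):
    """Apply CNF steps 2 and 3 to a grammar that is assumed to be cleaned up."""
    result = {head: [] for head in productions}
    term_var = {}                              # terminal -> its fresh variable

    def var_for(t):
        if t not in term_var:
            term_var[t] = f"<{t}>"             # hypothetical fresh name
            result[term_var[t]] = [(t,)]       # new rule A -> a
        return term_var[t]

    counter = 0
    for head, bodies in productions.items():
        for body in bodies:
            if len(body) == 1:                 # A -> a is already in CNF
                result[head].append(body)
                continue
            # Step 2: replace terminals in long bodies by their variables.
            syms = [var_for(s) if s in terminals else s for s in body]
            # Step 3: break bodies of length >= 3 into a cascade.
            cur = head
            while len(syms) > 2:
                counter += 1
                fresh = f"<C{counter}>"
                result.setdefault(cur, []).append((syms[0], fresh))
                cur = fresh
                syms = syms[1:]
            result.setdefault(cur, []).append(tuple(syms))
    return result


# Hypothetical example: S -> aSb | ab
terminals = {"a", "b"}
cnf = to_cnf({"S": [("a", "S", "b"), ("a", "b")]}, terminals)
for head, bodies in cnf.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```

Every body in the result is either a single terminal or exactly two variables, which is the CNF shape.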


Example of CNF conversion


Let's start with the grammar (step 1 already done)


The Pumping Lemma for Context-Free Languages

• The "pumping lemma for context-free languages" says that in any sufficiently long string in a CFL, it is possible to find at most two short, nearby substrings that we can "pump" in tandem.

• That is, we may repeat both of the strings i times, for any integer i ≥ 0, and the resulting string will still be in the language.

• Remember the "pumping lemma for regular languages" says that we can always find one small string to pump.


The Pumping Lemma for CFLs – The Size of Parse Trees

• Our first step in deriving a pumping lemma for CFL's is to examine the shape and size of parse trees.

– One of the uses of CNF is to turn parse trees into binary trees.

– These trees have some convenient properties, one of which we exploit here.

Theorem 7.17: Suppose we have a parse tree according to a Chomsky-Normal-Form grammar G = (V, T, P, S), and suppose that the yield of the tree is a terminal string w. If the length of the longest path is n, then |w| ≤ 2^(n-1).
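The bound of Theorem 7.17 can be checked on a small CNF parse tree. In this Python sketch a tree is a nested tuple, an encoding assumed here for illustration: `(variable, terminal)` for a production A → a and `(variable, left, right)` for A → BC. Path length is counted in edges, so a tree for A → a has a longest path of length 1 and a yield of length 1 = 2^0.

```python
def longest_path(tree):
    """Number of edges on the longest root-to-leaf path of a CNF parse tree."""
    if len(tree) == 2:                # (variable, terminal): a single edge
        return 1
    _, left, right = tree
    return 1 + max(longest_path(left), longest_path(right))


def yield_of(tree):
    """Concatenation of the leaf labels from left to right."""
    if len(tree) == 2:
        return tree[1]
    _, left, right = tree
    return yield_of(left) + yield_of(right)


# Hypothetical parse tree for the string "aab".
tree = ("S", ("A", "a"), ("S", ("A", "a"), ("B", "b")))
n = longest_path(tree)
w = yield_of(tree)
print(n, w, len(w) <= 2 ** (n - 1))
```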


The Pumping Lemma for CFLs

Theorem 7.18: (The pumping lemma for context-free languages) Let L be a CFL. Then there exists a constant n such that if z is any string in L such that |z| is at least n, then we can write z = uvwxy, subject to the following conditions:

1. |vwx| ≤ n. That is, the middle portion is not too long.

2. vx ≠ ε. Since v and x are the pieces to be "pumped," this condition says that at least one of the strings we pump must not be empty.

3. For all i ≥ 0, uv^i wx^i y is in L. That is, the two strings v and x may be "pumped" any number of times, including 0, and the resulting string will still be a member of L.
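To make condition 3 concrete, the Python sketch below pumps one particular decomposition of a string in the CFL {0^k 1^k | k ≥ 1}. The language, the string z = 000111, and the split u = 00, v = 0, w = ε, x = 1, y = 11 are all examples assumed here, not taken from the slides.

```python
def pump(u, v, w, x, y, i):
    """Build the pumped string u v^i w x^i y."""
    return u + v * i + w + x * i + y


def in_language(s):
    """Membership test for the example CFL {0^k 1^k | k >= 1}."""
    k = len(s) // 2
    return k >= 1 and s == "0" * k + "1" * k


# Assumed decomposition of z = 000111 with vx nonempty and |vwx| = 2.
u, v, w, x, y = "00", "0", "", "1", "11"
for i in range(5):
    z_i = pump(u, v, w, x, y, i)
    print(i, z_i, in_language(z_i))
```

Pumping v and x together keeps the numbers of 0's and 1's equal, which is why every pumped string stays in the language.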


The Pumping Lemma for CFLs - Proof

PROOF:

• Our first step is to find a Chomsky-Normal-Form grammar G for L.

– Technically, we cannot find such a grammar if L is the CFL ∅ or {ε}.

– However, if L = ∅ then the statement of the theorem, which talks about a string z in L, surely cannot be violated, since there is no such z in ∅.

– Also, the CNF grammar G will actually generate L − {ε}, but that is again not of importance, since we shall surely pick n > 0, in which case z cannot be ε anyway.


• Now, starting with a CNF grammar G = (V,T,P,S) such that L(G) = L − {ε}, let G have m variables.

• Choose n = 2^m.

• Next, suppose that z in L is of length at least n.

– By Theorem 7.17, any parse tree whose longest path is of length m or less must have a yield of length 2^(m-1) = n/2 or less.

– Such a parse tree cannot have yield z, because z is too long.

– Thus, any parse tree with yield z has a path of length at least m + 1.


• The figure suggests the longest path in the tree for z, which passes through variables A_0, A_1, ..., A_k and ends at a terminal leaf; k is at least m and the path is of length k+1.

• Since k ≥ m, there are at least m+1 occurrences of variables A_0, A_1, ..., A_k on the path.

• As there are only m different variables in V, at least two of the last m + 1 variables on the path (that is, A_{k-m} through A_k, inclusive) must be the same variable.

• Suppose A_i = A_j, where k-m ≤ i < j ≤ k.


• Then it is possible to divide the tree as shown in the figure.

• String w is the yield of the subtree rooted at A_j.

• Strings v and x are the strings to the left and right, respectively, of w in the yield of the larger subtree rooted at A_i.

• Note that, since there are no unit productions, v and x could not both be ε, although one could be.

• Finally, u and y are those portions of z that are to the left and right, respectively, of the subtree rooted at A_i.


• If A_i = A_j = A, then we can construct new parse trees from the original tree, as suggested in (a).

• First, we may replace the subtree rooted at A_i, which has yield vwx, by the subtree rooted at A_j, which has yield w.

• The reason we can do so is that both of these trees have root labeled A.

• The resulting tree is suggested in (b); it has yield uwy and corresponds to the case i = 0 in the pattern of strings uv^i wx^i y.

• Another option is suggested by (c). There, we have replaced the subtree rooted at A_j by the entire subtree rooted at A_i.

• Again, the justification is that we are substituting one tree with root labeled A for another tree with the same root label.

• The yield of this tree is uv^2 wx^2 y. Repeating the substitution gives a tree with yield uv^i wx^i y for any exponent i, which establishes condition (3).


• The remaining detail is condition (1), which says that |vwx| ≤ n.

• However, we picked A_i to be close to the bottom of the tree; that is, k - i ≤ m.

• Thus, the longest path in the subtree rooted at A_i is no greater than m + 1.

• By Theorem 7.17, the subtree rooted at A_i has a yield whose length is no greater than 2^m = n.

Q.E.D


The Pumping Lemma for CFLs - Example

Example: Let L be the language {0^n 1^n 2^n | n ≥ 1}. This language is not a CFL.

• Suppose L were context-free.

• Then there is an integer n given to us by the pumping lemma.

• Let us pick z = 0^n 1^n 2^n.

• z = uvwxy, where |vwx| ≤ n and v and x are not both ε.

• Then we know that vwx cannot involve both 0's and 2's, since the last 0 and the first 2 are separated by n+1 positions.

• We shall prove that L contains some string known not to be in L, thus contradicting the assumption that L is a CFL.


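• The cases are as follows:

1. vwx has no 2's. Then vx consists of only 0's and 1's and has at least one of these symbols. Then uwy, which would have to be in L by the pumping lemma, has n 2's but fewer than n 0's or fewer than n 1's, so it is not in L.

2. vwx has no 0's. Similarly, uwy has n 0's but fewer than n 1's or fewer than n 2's, so it is not in L.

• Whichever case holds, we conclude that L has a string we know not to be in L.

• This contradiction allows us to conclude that our assumption was wrong; L is not a CFL. Q.E.D.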
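The case analysis can be sanity-checked by brute force for small n. The Python sketch below tries every split z = uvwxy with |vwx| ≤ n and vx ≠ ε and tests whether the pumped strings stay in the language for i = 0, ..., 3. It is only an illustration for specific values of n, not a replacement for the proof, and the function names are hypothetical.

```python
def in_012(s):
    """Membership test for {0^k 1^k 2^k | k >= 1}."""
    k = len(s) // 3
    return k >= 1 and s == "0" * k + "1" * k + "2" * k


def in_01(s):
    """Membership test for the CFL {0^k 1^k | k >= 1}, used as a contrast."""
    k = len(s) // 2
    return k >= 1 and s == "0" * k + "1" * k


def pumpable_split_exists(z, n, member, max_i=3):
    """True if some split z = uvwxy with |vwx| <= n and vx nonempty keeps
    u v^i w x^i y in the language for every i in 0..max_i."""
    length = len(z)
    for a in range(length + 1):                       # u = z[:a]
        for d in range(a, min(a + n, length) + 1):    # vwx = z[a:d]
            for b in range(a, d + 1):                 # v = z[a:b]
                for c in range(b, d + 1):             # w = z[b:c], x = z[c:d]
                    u, v, w, x, y = z[:a], z[a:b], z[b:c], z[c:d], z[d:]
                    if not (v or x):
                        continue
                    if all(member(u + v * i + w + x * i + y)
                           for i in range(max_i + 1)):
                        return True
    return False


for n in range(1, 5):
    z = "0" * n + "1" * n + "2" * n
    print(n, pumpable_split_exists(z, n, in_012))
print(pumpable_split_exists("0011", 2, in_01))
```

No split survives for 0^n 1^n 2^n, in line with the two cases above, while the context-free language {0^k 1^k} does admit a pumpable split.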


Closure Properties of Context-Free Languages


Substitutions

Theorem 7.24: The context-free languages are closed under the following operations:

1. Union.

2. Concatenation.

3. Closure (*), and positive closure (+).

4. Homomorphism.
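The slides derive these closure properties via substitutions. As a complementary illustration, union, concatenation, and closure can also be obtained by direct grammar constructions, sketched below in Python. The dictionary representation, the assumption that the two grammars use disjoint variable names, and the fresh start symbol "S" are all choices made for this sketch.

```python
def union(g1, s1, g2, s2, start="S"):
    """Grammar for L1 ∪ L2: add S -> S1 | S2 (variable names assumed disjoint)."""
    g = {**g1, **g2}
    g[start] = [(s1,), (s2,)]
    return g, start


def concat(g1, s1, g2, s2, start="S"):
    """Grammar for L1 L2: add S -> S1 S2."""
    g = {**g1, **g2}
    g[start] = [(s1, s2)]
    return g, start


def star(g1, s1, start="S"):
    """Grammar for L1*: add S -> S1 S | ε (the empty tuple stands for ε)."""
    g = dict(g1)
    g[start] = [(s1, start), ()]
    return g, start


# Hypothetical example grammars: A generates {0^n 1^n | n >= 1}, B generates 2+.
g1 = {"A": [("0", "A", "1"), ("0", "1")]}
g2 = {"B": [("2", "B"), ("2",)]}
g_union, _ = union(g1, "A", g2, "B")
g_concat, _ = concat(g1, "A", g2, "B")
g_star, _ = star(g1, "A")
print(g_union["S"], g_concat["S"], g_star["S"])
```

Each construction only adds productions for a new start symbol, so the result is again a context-free grammar.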


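Reversal: if L is a CFL, then so is its reversal L^R.

Intersection With a Regular Language: if L is a CFL and R is a regular language, then L ∩ R is a CFL.

The CFL's are not closed under intersection.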
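A standard witness for non-closure under intersection uses L1 = {0^n 1^n 2^i | n, i ≥ 1} and L2 = {0^i 1^n 2^n | n, i ≥ 1}. Both are CFL's, but L1 ∩ L2 = {0^n 1^n 2^n | n ≥ 1}, which the pumping lemma shows is not a CFL. The Python sketch below, with membership tests written here for illustration, enumerates all strings over {0, 1, 2} up to length 6 and lists those in both languages.

```python
import re
from itertools import product

BLOCKS = re.compile(r"(0+)(1+)(2+)")   # shape 0...01...12...2, each block nonempty


def in_L1(s):
    """Membership in {0^n 1^n 2^i | n, i >= 1}."""
    m = BLOCKS.fullmatch(s)
    return bool(m) and len(m.group(1)) == len(m.group(2))


def in_L2(s):
    """Membership in {0^i 1^n 2^n | n, i >= 1}."""
    m = BLOCKS.fullmatch(s)
    return bool(m) and len(m.group(2)) == len(m.group(3))


def strings_up_to(n):
    """All strings over {0, 1, 2} of length at most n, shortest first."""
    for k in range(n + 1):
        for letters in product("012", repeat=k):
            yield "".join(letters)


both = [s for s in strings_up_to(6) if in_L1(s) and in_L2(s)]
print(both)
```

Within this length bound the strings in both languages are exactly those of the form 0^n 1^n 2^n.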
