5
CS340: Theory of Computation Sem I 2015-16 Lecture Notes 9: Chomsky Normal Form Raghunath Tewari IIT Kanpur Normal forms are CFGs whose substitution rules have a special form. Usually normal forms are general enough in the sense that any CFL will have a CFG in that normal form. Normal forms have a nice combinatorial structure which are useful in proving properties about CFLs. 1 Chomsky Normal Form A CFG is said to be in Chomsky Normal Form (in short, CNF) if the following are true. 1. Every rule is of the form - A -→ BC , or - A -→ a, where A,B,C are variables and a is a terminal. 2. The start variable is not present in the right hand side of any rule. 3. The rule S -→ may be present (depending on whether the language has or not). 1.1 Converting a CFG to a grammar in Chomsky Normal Form Let G =(V, Σ,P,S ) be a CFG. Below we give an algorithm to convert G into a CFG in Chomsky Normal Form. 1. Removing S from RHS of rules If S appears on the RHS of some rule, add a new start variable S 0 and the rule S 0 -→ S . 2. Removing -rules Pick an -rule A -→ and remove it (where A is not the start variable). - Now for every occurrence of A on the RHS of some of all rules, add a rule deleting that occurrence of A. If A occurs multiple times on the RHS of a rule, then multiple rules might be added. - If we have the rule R -→ A, then add R -→ , unless the rule R -→ has been removed previously. - Repeat until no more -rules remain, except possibly involving the start variable. Example : Suppose a grammar had the following rules: A -→ B -→ uAv C -→ u 1 Av 1 Aw 1

09-toc

Embed Size (px)

DESCRIPTION

toc lecture

Citation preview

Page 1: 09-toc

CS340: Theory of Computation Sem I 2015-16

Lecture Notes 9: Chomsky Normal Form

Raghunath Tewari IIT Kanpur

Normal forms are CFGs whose substitution rules have a special form. Usually normal forms aregeneral enough in the sense that any CFL will have a CFG in that normal form. Normal formshave a nice combinatorial structure which are useful in proving properties about CFLs.

1 Chomsky Normal Form

A CFG is said to be in Chomsky Normal Form (in short, CNF) if the following are true.

1. Every rule is of the form

- A −→ BC, or

- A −→ a,

where A,B,C are variables and a is a terminal.

2. The start variable is not present in the right hand side of any rule.

3. The rule S −→ ε may be present (depending on whether the language has ε or not).

1.1 Converting a CFG to a grammar in Chomsky Normal Form

Let G = (V,Σ, P, S) be a CFG. Below we give an algorithm to convert G into a CFG in ChomskyNormal Form.

1. Removing S from RHS of rules

If S appears on the RHS of some rule, add a new start variable S0 and the rule S0 −→ S.

2. Removing ε-rules

Pick an ε-rule A −→ ε and remove it (where A is not the start variable).

- Now for every occurrence of A on the RHS of some of all rules, add a rule deleting thatoccurrence of A. If A occurs multiple times on the RHS of a rule, then multiple rulesmight be added.

- If we have the rule R −→ A, then add R −→ ε, unless the rule R −→ ε has beenremoved previously.

- Repeat until no more ε-rules remain, except possibly involving the start variable.

Example: Suppose a grammar had the following rules:

A −→ ε

B −→ uAv

C −→ u1Av1Aw1

Page 2: 09-toc

Then the grammar formed by removing the rule A −→ ε will have the corresponding set ofrules

B −→ uAvC −→ u1Av1Aw1

B −→ uv (rule added)C −→ u1v1Aw1 (rule added with first occurrence of A removed)C −→ u1Av1w1 (rule added with second occurrence of A removed)C −→ u1v1w1 (rule added with both occurrence of A removed)

3. Removing unit rules

Remove a rule A −→ B and for all rules of the form B −→ u add the rule A −→ u, unlessA −→ u is a unit rule that has already been removed. Repeat until no more unit rulesremain.

Example: Suppose a grammar had the following rules:

A −→ B

B −→ u

Then the grammar formed by removing the rule A −→ B will have the corresponding set ofrules

B −→ uA −→ u (rule added)

4. Shortening the RHS

For every rule of the form A −→ u1u2 . . . uk for k ≥ 3, where ui ∈ V ∪ T , replace the rulewith

A −→ u1A1

A1 −→ u2A2

A2 −→ u3A3

···

Ak−2 −→ uk−1uk

Here Ai’s are the new variables added to the grammar.

5. Replacing certain terminals

If there is a rule of the form A −→ uv where at least one of either u or v is a terminalsymbol (say u), then replace the rule A −→ uv with

A −→ UvU −→ u

where U is a new variable added to the grammar. Repeat until no such rules remain.

Page 3: 09-toc

1.2 An Example – CFG to CNF

Consider the following CFG where S is the start variable:

S −→ ASB

A −→ aASA | a | εB −→ SbS | A | bb

We will convert the above grammar into a grammar in CNF. The rules/variables that get addedat each step are shown in bold font.

1. Adding a new start variable S0, since S appears on RHS of some rules.

S0 −→ S

S −→ ASB

A −→ aASA | a | εB −→ SbS | A | bb

2. Eliminating A −→ ε.

S0 −→ S

S −→ ASB | SBA −→ aASA | a | aSA | aAS | aSB −→ SbS | A | bb | ε

3. Eliminating B −→ ε.

S0 −→ S

S −→ ASB | SB | AS | SA −→ aASA | a | aSA | aAS | aSB −→ SbS | A | bb

4. Eliminating the unit rule S −→ S.

S0 −→ S

S −→ ASB | SB | ASA −→ aASA | a | aSA | aAS | aSB −→ SbS | A | bb

5. Eliminating the unit rule B −→ A.

S0 −→ S

S −→ ASB | SB | ASA −→ aASA | a | aSA | aAS | aSB −→ SbS | bb | aASA | a | aSA | aAS | aS

Page 4: 09-toc

6. Eliminating the unit rule S0 −→ S.

S0 −→ ASB | SB | AS

S −→ ASB | SB | ASA −→ aASA | a | aSA | aAS | aSB −→ SbS | bb | aASA | a | aSA | aAS | aS

7. Adding variable U1 and rule U1 −→ AS.

S0 −→ U1B | SB | ASS −→ U1B | SB | ASA −→ aU1A | a | aSA | aU1 | aSB −→ SbS | bb | aU1A | a | aSA | aU1 | aS

U1 −→ AS

8. Adding variable U2 and rule U2 −→ aU1.

S0 −→ U1B | SB | ASS −→ U1B | SB | ASA −→ U2A | a | aSA | aU1 | aSB −→ SbS | bb | U2A | a | aSA | aU1 | aSU1 −→ AS

U2 −→ aU1

9. Adding variable U3 and rule U3 −→ aS.

S0 −→ U1B | SB | ASS −→ U1B | SB | ASA −→ U2A | a | U3A | aU1 | aSB −→ SbS | bb | U2A | a | U3A | aU1 | aSU1 −→ AS

U2 −→ aU1

U3 −→ aS

10. Adding variable U4 and rule U4 −→ Sb.

S0 −→ U1B | SB | ASS −→ U1B | SB | ASA −→ U2A | a | U3A | aU1 | aSB −→ U4S | bb | U2A | a | U3A | aU1 | aSU1 −→ AS

U2 −→ aU1

U3 −→ aS

U4 −→ Sb

Page 5: 09-toc

11. Adding variables V1, V2 and rules V1 −→ a, V2 −→ b.

S0 −→ U1B | SB | ASS −→ U1B | SB | ASA −→ U2A | a | U3A | V1U1 | V1S

B −→ U4S | V2V2 | U2A | a | U3A | V1U1 | V1S

U1 −→ AS

U2 −→ V1U1

U3 −→ V1S

U4 −→ SV2

V1 −→ a

V2 −→ b

Chomsky Normal Form of the given grammar is

S0 −→ U1B | SB | ASS −→ U1B | SB | ASA −→ U2A | U3A | V1U1 | V1S | aB −→ U4S | V2V2 | U2A | U3A | V1U1 | V1S | aU1 −→ AS

U2 −→ V1U1

U3 −→ V1S

U4 −→ SV2

V1 −→ a

V2 −→ b

Exercise 1. Problem 2.14 from textbook.