
Unit 5 - PowerPoint 7 Context-free Languages ... S 0S0 | 1S1 | ... •Simpler version S 01S | 01. 33


Unit 7

Context-free Languages

Reading: Sipser, chapter 2.1; Hopcroft et al., chapter 5


Grammar

• Another method of describing languages.

• More powerful than NFA.

• Can describe recursive structures.

• A basic tool in compilation theory.

• The computation is performed by creating a string (a derivation): we start from the start variable and rewrite it according to the grammar rules until we have the final output.

• The language of the grammar consists of all possible outputs.


The Origin of Grammar

• The origin of the name grammar for this

computational model is in natural languages,

where grammar is a collection of rules.

• This collection defines what is legal in the

language and what is not.

Computational Model for Grammars

• The computational model is a collection of substitution rules over an alphabet Σ and a set of variables V.

• Every grammar has a start symbol also called

a start variable (usually denoted by S).

• From the start variable we derive a word

using the substitution rules.

notation

• We use the notation "→" in grammar rules.

• It means:

– can be replaced by

– constructs

– produces

Example of a grammar

• Σ = {a, b, c}

• The following grammar generates all strings over Σ.

  S → aS (add a)
  S → bS (add b)
  S → cS (add c)
  S → ε (delete S)

Grammar

• A collection of substitution rules of the form: A → γ

• The symbol A is a variable.

• The string γ consists of variables and terminals (Σ).

• The variable S is the start variable.

Derivation of a word

1. Write down the start variable.

2. Find a variable A that is written down and a rule A → γ.

3. Replace the variable A with the string γ.

4. Repeat steps 2 and 3 until no variables remain.

notation

• We use the notation "⇒" to represent an actual derivation: α ⇒ β

• It means: the string β was derived from α using a substitution rule.

Production w=aacb

1. S → aS
2. S → bS
3. S → cS
4. S → ε

These rules can be written as: S → aS | bS | cS | ε

• How can the word w=aacb be produced?

  S ⇒(1) aS ⇒(1) aaS ⇒(3) aacS ⇒(2) aacbS ⇒(4) aacb


Parsing

• What we did is called parsing a word w according to a given grammar.

• To parse a word or a sentence means to break it into parts that conform to a given grammar.

• We can represent the same production sequence by a parse tree or a derivation tree.

• Each node in the tree is either a variable or a terminal.

• A terminal node is a leaf.

Parsing Tree of w=aacb

[Figure: parse tree. The root S has the children a and S; each following S again has one terminal child (a, c, b in turn) and one S child; the last S derives ε.]

Parsing Tree of w=aacb (cont.)

Or a step by step derivation:

[Figure: the same parse tree built up one rule application at a time, starting from a single S node and adding the children of the lowest S at each step.]

Another example

  expr → expr + term | term
  term → term × factor | factor
  factor → ( expr ) | a

• What are the terminals?

• Parse the string: a + a × a (is it unique?)

• Parse the string: (a + a) × a

Solution: in class

Context-Free Grammar (CFG): Formal Definition

A context-free grammar (CFG) G is a 4-tuple G = (V, Σ, S, R), where

1. V is a finite set called the variables.

2. Σ is a finite set, disjoint from V, called the terminals.

3. S ∈ V is the start symbol.

4. R is a finite set of production rules, with each rule being a variable and a string of variables and terminals: A → γ, A ∈ V and γ ∈ (V ∪ Σ)*

Hebrew term: דקדוק חסר הקשר

Derivation in CFG

• Let α, β and γ be strings of variables and terminals.

• If A → γ is a rule in the grammar, we say that αAβ derives αγβ, written αAβ ⇒ αγβ.

• We write x ⇒* y if x = y or there exists a sequence x1, x2, ..., xk, k ≥ 0, such that x ⇒ x1 ⇒ x2 ⇒ ... ⇒ xk ⇒ y.

  ⇒ means derives in one step
  ⇒+ means derives in one or more steps
  ⇒* means derives in zero or more steps

The language of CFG

• The language of the grammar is

  L(G) = {w ∈ Σ* | S ⇒* w}

• The language generated by a CFG is called a context-free language (CFL). Hebrew term: שפה חסרת הקשר
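To make the definition of L(G) concrete, the sketch below enumerates all words of L(G) up to a given length by breadth-first search over leftmost derivations. It assumes a hypothetical encoding (variables are single uppercase letters, right-hand sides are strings, "" stands for ε) and uses a crude cap on the length of intermediate strings to guarantee termination; for grammars with many ε rules this cap could in principle cut off some words, so treat it as an illustration rather than a general algorithm.

```python
from collections import deque

def language_up_to(rules, start, max_len):
    """Return the words w of L(G) with len(w) <= max_len, shortest first."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Index of the leftmost variable, or None if only terminals remain.
        idx = next((i for i, s in enumerate(form) if s.isupper()), None)
        if idx is None:
            words.add(form)
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if not s.isupper())
            # Prune forms that already contain too many terminals or are too long.
            if terminals <= max_len and len(new) <= 2 * max_len + 2 and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(words, key=lambda w: (len(w), w))

# Example grammar (an assumption for illustration): S → 0S1 | ε
print(language_up_to({"S": ["0S1", ""]}, "S", 4))
```

For the example grammar the search finds the empty word, 01 and 0011, which are exactly the words of the form 0^n 1^n of length at most 4.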

Derivation steps

• In fact,

  L(G) = {w ∈ Σ* | S ⇒+ w}

• This is because a derivation with zero steps produces only S.

• S is a variable, not a string over Σ, so it cannot belong to L(G).

Common Notations

• Terminals: lowercase letters from the beginning of the alphabet (a, b, c).

• Variables: uppercase letters from the beginning of the alphabet (A, B, C).

• Strings of terminals: lowercase letters from the end of the alphabet (u, v, w).

• Mixed strings (terminals + variables): lowercase Greek letters (α, β, γ).

• Terminal or variable: uppercase letters from the end of the alphabet (X, Y, Z).

Examples over Σ={0,1}

• Construct a grammar for the following language: L = {0, 00, 1}

• G = (V={S}, Σ={0,1}, S, R)

• R:
  S → 0
  S → 00
  S → 1

• Alternatively: S → 0 | 00 | 1

When a variable has several production rules, they can all be written on one line. Different rules are separated by the symbol '|'.

Examples over Σ={0,1}

• Construct a grammar for the following language: L = {0^n 1^n | n ≥ 0}

• G = (V={S}, Σ={0,1}, S, R) where R:
  S → 0S1
  S → ε

Alternatively: S → 0S1 | ε

Examples over Σ={0,1}

• Construct a grammar for the following language: L = {0^n 1^n | n ≥ 1}

• G = (V={S}, Σ={0,1}, S, R) where R:
  S → 0S1 | 01

Examples over Σ={0,1}

• Construct a grammar for the language of the regular expression 0*1⁺

• G = (V={S,B}, Σ={0,1}, S, R) where R:
  S → 0S | 1B
  B → 1B | ε

What about 0*1* ?

Examples over Σ={0,1}

• Construct a grammar for the following language: L = {0^(2i+1) | i ≥ 0}

• G = (V={S}, Σ={0,1}, S, R) where R:
  S → 0 | 00S

Examples over Σ={0,1}

• Construct a grammar for the following language: L = {0^(i+1) 1^i | i ≥ 0}

• G = (V={S}, Σ={0,1}, S, R) where R:
  S → 0 | 0S1

Examples over Σ={0,1}

• Construct a grammar for the following language: L = {w | w ∈ Σ* and |w| mod 2 = 1}

• G = (V={S}, Σ={0,1}, S, R) where R:
  S → 0 | 1 | 1S1 | 0S0 | 1S0 | 0S1

Let's parse: 011100101

Examples over Σ={0,1}

• Construct a grammar for the following language: L = {0^n 1^n | n ≥ 1} ∪ {1^n 0^n | n ≥ 0}

• G = (V={S,A,B}, Σ={0,1}, S, R) where R:
  S → A | B
  A → 0A1 | 01
  B → 1B0 | ε

Exercise

Construct grammars for the following languages over Σ={0,1}:

1. L1 = {w | #1(w) is even}

2. L2 = {w | #1(w) is odd}

3. L3 = {w | #1(w) = #0(w)}

4. L4 = {0^n 1 0^m 1 0^(n+m) | n, m ≥ 0}

Solution: In class

From a Grammar to a CFL

• Give a description of L(G) for the following grammar:
  S → 0S0 | 1

• L(G) = {0^n 1 0^n | n ≥ 0}

From a Grammar to a CFL

• Give a description of L(G) for the following grammar:
  S → 0S0 | 1S1 | ε

• L(G) = {the even-length palindromes over Σ={0,1}}

or

• L(G) = {w w^R | w ∈ Σ*}

From a Grammar to a CFL

• Give a description of L(G) for the following grammar:
  S → 0A | 0B
  A → 1S
  B → 1

• L(G) = {(01)^n | n ≥ 1}

• Simpler version: S → 01S | 01

From a Grammar to a CFL

• Give a description of L(G) for the following grammar:
  S → 0S11 | 0

• L(G) = {0^(n+1) 1^(2n) | n ≥ 0}

From a Grammar to a CFL

• Give a description of L(G) for the following grammar:
  S → E | NE
  N → D | DN
  D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
  E → 0 | 2 | 4 | 6

• L(G) = {w | w represents an even octal number}

From a Grammar to a CFL

• Give a description of L(G) for the following grammar:
  S → N.N | -N.N
  N → D | DN
  D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

• L(G) = {w | w represents a decimal rational number (that has a finite representation)}

Exercise

Give a description for L(G) for each of the following grammars over Σ={a,b,$}:

G1: S → aSb | A
    A → Aa | ε

G2: S → aSb | SS | ε

G3: S → aSa | bSb | aS | bS | $

Solution: In class

Exercise

Give a description for L(G) for the following grammar over Σ = {0, ..., 9, +, *}:

  E → E+E | E*E | T
  T → 0 | 1 | 2 | ... | 9

• Let's parse the string 3+4*5

– E ⇒ E+E ⇒ T+E ⇒ 3+E ⇒ 3+E*E ⇒* 3+4*5

– E ⇒ E*E ⇒ E*T ⇒ E*5 ⇒ E+E*5 ⇒* 3+4*5

Exercise (cont.)

  E → E+E | E*E | T
  T → 0 | 1 | 2 | ... | 9

• The string 3+4*5 can be produced in several ways:

[Figure: two parse trees for 3+4*5. In the first, the root applies E → E+E and the right subtree derives 4*5; in the second, the root applies E → E*E and the left subtree derives 3+4.]

Exercise (cont.)

• So if we use this grammar to define a programming language, we will have several possible computations for 3+4*5.

• There is no precedence of '*' over '+'.

• This language will be impossible to use because the user won't know which computation the compiler uses.

• Two possible results: 35 or 23.
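The two results can be reproduced by evaluating the two parse trees directly. This is a small sketch, assuming a hypothetical encoding of a parse tree as a nested tuple (left, operator, right) with integer leaves; it is not a parser, only an evaluator for trees that are already built.

```python
def evaluate(tree):
    """Evaluate a parse tree given as an int leaf or a (left, op, right) tuple."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a * b

def flatten(tree):
    """Read the leaves left to right: the string the tree derives."""
    if isinstance(tree, int):
        return str(tree)
    left, op, right = tree
    return flatten(left) + op + flatten(right)

plus_at_root = (3, "+", (4, "*", 5))    # derivation starting with E ⇒ E+E
times_at_root = ((3, "+", 4), "*", 5)   # derivation starting with E ⇒ E*E

for tree in (plus_at_root, times_at_root):
    print(flatten(tree), "=", evaluate(tree))
```

Both trees derive the same string 3+4*5, yet they evaluate to 23 and 35 respectively, which is exactly the problem the slide describes.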

Ambiguity

• The ability of a grammar to generate the same string in several ways is called ambiguity.

• A grammar is ambiguous if there exists a string w that can be derived by at least two different parse trees.

• Sometimes it is possible to find, for an ambiguous grammar, an unambiguous one defining the same language.

• Some CFLs are inherently ambiguous.

– Example: {a^i b^j c^k | i=j or j=k}

Finite Languages

Theorem: Any finite language can be generated by a CFG.

Proof:

• Let L = {w_i | 1 ≤ i ≤ n and w_i ∈ Σ*} be a finite language over Σ.

• We construct the following grammar:
  S → w_1
  ..
  S → w_n

Regular Languages

• Question: Can the regular languages be generated by CFGs?

• Answer: in the following.

The Regular Grammar

A grammar is called regular if each production has one of the following forms:

  A → w
  or
  A → wB

where w ∈ Σ* and A, B ∈ V.

Regular grammar example

Example:
  S → 012
  S → 0A
  A → 0A
  A → 0

The Regular Grammar

• Theorem: The set of languages that have a regular grammar is the set of regular languages.

• Proof: Soon

• Idea: Given an NFA we will create an equivalent regular grammar. Given a regular grammar we will build an equivalent NFA.

The Regular Grammar

• Conclusion: The regular languages are a proper subset of the context-free languages. Every regular grammar is a CFG, while {0^n 1^n | n ≥ 0} is context-free but not regular.

Examples of regular grammars

Construct a regular grammar for the following regular expressions:

L1 = 0*
Regular grammar: S → 0S | ε

L2 = (0+1)⁺
Regular grammar: S → 0S | 1S | 0 | 1

L3 = 0*+1*
Regular grammar:
  S → A | B
  A → 0A | ε
  B → 1B | ε

L4 = (01)⁺
Regular grammar: S → 01S | 01

From NFA to Regular Grammars

Lemma: A regular grammar can be constructed for any NFA.

The basic idea (no proof): Translation of the transition function of an automaton to rules in a regular grammar.

Algorithm: from NFA to RG

1. Rename all states of the NFA to a set of capital letters.

2. Name the start state of the NFA S.

3. Translate each transition δ(A, a) = B into the rule A → aB, and each transition δ(A, ε) = B into the rule A → B.

4. Add the rule A → ε for each accepting state A in the NFA.
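Steps 3 and 4 translate almost literally into code. The following is a minimal sketch, assuming the states have already been renamed to capital letters (steps 1 and 2) and that the NFA is given as a hypothetical set of (state, symbol, next_state) triples with "" standing for an ε-move; the sample automaton is made up for illustration.

```python
def nfa_to_regular_grammar(transitions, accepting):
    """Build regular-grammar rules from NFA transitions.

    transitions: set of (state, symbol, next_state); symbol "" is an ε-move.
    accepting:   set of accepting states.
    Returns a dict: variable -> list of right-hand sides ("" stands for ε).
    """
    rules = {}
    for state, symbol, nxt in sorted(transitions):
        # δ(A, a) = B becomes A → aB; δ(A, ε) = B becomes A → B.
        rules.setdefault(state, []).append(symbol + nxt)
    for state in sorted(accepting):
        # Each accepting state A gets the rule A → ε.
        rules.setdefault(state, []).append("")
    return rules

# Hypothetical two-state NFA: S is the start state and the only accepting state.
transitions = {("S", "0", "S"), ("S", "0", "A"), ("S", "1", "A"),
               ("A", "0", "A"), ("A", "1", "S")}
rules = nfa_to_regular_grammar(transitions, {"S"})
for variable, bodies in sorted(rules.items()):
    print(variable, "→", " | ".join(body or "ε" for body in bodies))
```

The rules come out in sorted order, so the grammar for S lists 0A, 0S, 1A and ε, and the grammar for A lists 0A and 1S.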

Example:

[Figure: NFA with states q0 (start, accepting) and q1. q0 loops on 0 and goes to q1 on 0,1; q1 loops on 0 and goes back to q0 on 1.]

Denote q0 by S and q1 by A.

The regular grammar is:
  S → 0S | 0A | 1A | ε
  A → 0A | 1S


From RG to NFA

Lemma: An NFA can be constructed for any regular grammar G.

The basic idea (no proof): Construction of an NFA that accepts the language of the given regular grammar.

Algorithm: from RG to NFA

1. Transform all rules of the grammar to be in a simple regular form:

  A → c or A → cB

where c ∈ Σ and A, B ∈ V (this can be done by adding variables).

Algorithm (cont.)

2. The start state of the NFA is the grammar's start symbol S.

3. For each rule:

  If A → cB, construct a state transition from A to B labeled c.

  If A → B, construct a state transition from A to B labeled ε.

Algorithm (cont.)

4. For each rule A → c, c ∈ Σ, add a new accepting state F and construct a state transition from A to the new state F labeled c.

Example:

Input:
  S → 0S | 11A
  A → 1A | 0

New grammar:
  S → 0S | 1B
  B → 1A
  A → 1A | 0

Resulting NFA:

[Figure: NFA with states S, B, A and accepting state F. S loops on 0 and goes to B on 1; B goes to A on 1; A loops on 1 and goes to F on 0.]
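The construction can be tried out in a few lines. This is a sketch under stated assumptions: the grammar is already in the simple regular form, variables are single uppercase letters, right-hand sides are strings, and a single shared accepting state named "F" is used for all rules of the form A → c (the names `grammar_to_nfa` and `accepts` are illustrative, not from the slides).

```python
def grammar_to_nfa(rules, start="S", final="F"):
    """Turn rules in simple regular form into NFA transitions.

    rules: variable -> list of right-hand sides of the form "c", "cB" or "B".
    Returns (transitions, start_state, accepting_states).
    """
    transitions = set()
    for var, bodies in rules.items():
        for body in bodies:
            if len(body) == 2:                       # A → cB
                transitions.add((var, body[0], body[1]))
            elif len(body) == 1 and body.isupper():  # A → B, an ε-move
                transitions.add((var, "", body))
            else:                                    # A → c, move to the accepting state
                transitions.add((var, body, final))
    return transitions, start, {final}

def accepts(nfa, word):
    """Simulate the NFA on word, following ε-moves where present."""
    transitions, start, accepting = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for p, c, r in transitions:
                if p == q and c == "" and r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({r for p, c, r in transitions if p in current and c == ch})
    return bool(current & accepting)

# The "new grammar" of the example: S → 0S | 1B, B → 1A, A → 1A | 0
nfa = grammar_to_nfa({"S": ["0S", "1B"], "B": ["1A"], "A": ["1A", "0"]})
print(accepts(nfa, "0110"), accepts(nfa, "10"))
```

The automaton accepts words such as 110 and 0110 (any number of 0s, at least two 1s, then a single 0) and rejects 10, matching what the grammar can derive.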

Today's Topics:

• Operations over Grammars: Union, Concatenation, Kleene Star

• Simplified Grammars

• Chomsky Normal Forms

• Chomsky Hierarchy

Operations over Grammars

We can perform the following operations over grammars:

1. Union

2. Concatenation

3. Kleene star

Corollary: The context-free languages are closed under the above operations.

Note: CFLs are not closed under complement or under intersection (proof: in class).

(Are regular grammars closed under complement / intersection?)

Union

Given two languages and their grammars:

• L1 with G1 = (V1, Σ1, S1, R1) and

• L2 with G2 = (V2, Σ2, S2, R2)

such that V1 ∩ V2 = ∅, we construct their union by merging their grammars:

  G = (V1 ∪ V2 ∪ {S}, Σ1 ∪ Σ2, S, R1 ∪ R2 ∪ {S → S1 | S2})

Proof idea: The rule S → S1 | S2 enables a string w to be derived either from S1 or from S2.

Concatenation

Given two languages and their grammars:

• L1 with G1 = (V1, Σ1, S1, R1) and

• L2 with G2 = (V2, Σ2, S2, R2)

such that V1 ∩ V2 = ∅, we construct their concatenation:

  Gcon = (V1 ∪ V2 ∪ {S}, Σ1 ∪ Σ2, S, R1 ∪ R2 ∪ {S → S1S2})

Proof idea: The rule S → S1S2 enables the creation of a string w=uv where u can be derived from S1 and v from S2.

Kleene star

Given a language L1 and its grammar G = (V1, Σ1, S1, R1), then G* is the grammar for L1*:

  G* = (V1 ∪ {S}, Σ1, S, R1 ∪ {S → S1S | ε})

Proof idea:

• The rule S → S1S means that a word w in L(G*) is built of two parts w=uv such that u is derived from S1 and v is derived from S.

• The rule S → ε ends the repetition; used alone, it derives the empty string ε.
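The union, concatenation and Kleene star constructions each add a single new start rule, which the sketch below makes explicit. It assumes a hypothetical encoding in which rules map a variable name to a list of tuples of symbols (the empty tuple is ε), that the variable sets are disjoint, and that the new start name "S" is not already used. The helper `words` is only a bounded breadth-first enumerator used to inspect the result; its length cap is a heuristic.

```python
from collections import deque

def union(r1, s1, r2, s2, s="S"):
    return {**r1, **r2, s: [(s1,), (s2,)]}   # S → S1 | S2

def concat(r1, s1, r2, s2, s="S"):
    return {**r1, **r2, s: [(s1, s2)]}       # S → S1 S2

def star(r1, s1, s="S"):
    return {**r1, s: [(s1, s), ()]}          # S → S1 S | ε

def words(rules, start, max_len):
    """Words of length <= max_len, found by breadth-first leftmost derivation."""
    seen, queue, found = {(start,)}, deque([(start,)]), set()
    while queue:
        form = queue.popleft()
        i = next((k for k, x in enumerate(form) if x in rules), None)
        if i is None:
            found.add("".join(form))
            continue
        for rhs in rules[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            n_terminals = sum(1 for x in new if x not in rules)
            if n_terminals <= max_len and len(new) <= 2 * max_len + 3 and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda w: (len(w), w))

# Illustrative grammars: G1 generates 0^n 1^n (n >= 1), G2 generates 1^n (n >= 1).
G1 = {"S1": [("0", "S1", "1"), ("0", "1")]}
G2 = {"S2": [("1", "S2"), ("1",)]}

print(words(union(G1, "S1", G2, "S2"), "S", 3))
print(words(concat(G1, "S1", G2, "S2"), "S", 4))
print(words(star(G2, "S2"), "S", 2))
```

Because each construction only adds rules for the fresh variable S, the original grammars are reused unchanged; this is why the disjointness of V1 and V2 matters.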

Simplified Grammars

• Every context-free grammar can be rewritten in a simplified form.

• A simplified form of the grammar is a grammar that

– doesn't have ε rules

and

– doesn't have unit rules.

• An ε rule is a rule of the form: A → ε.

• A unit rule is a rule of the form: A → B.

Step 1: Removing ε rules

• A context-free language that does not contain ε can be written without ε rules.

• If ε ∈ L, then first remove ε from L.

• Build a simplified form CFG without ε rules for the remaining language.

• Then add a new start variable S' with the rule S' → S | ε.

Algorithm for removing ε rules

1. Find an ε rule A → ε (A ≠ S) and remove it from R.

2. For each rule in R of the form B → αAβ where α, β ∈ (V ∪ Σ)*, add to R the rule B → αβ.

– Note: We do so for each occurrence of A, e.g. for B → αAβAγ we add B → αβAγ | αAβγ | αβγ.

– Note: For a rule B → A, we add a new rule B → ε unless this rule has already been removed through this process.

3. Repeat from step 1 until we eliminate all ε rules.
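The effect of the algorithm can be sketched in code. Note that this sketch swaps in the equivalent "nullable set" formulation (first find every variable that can derive ε, then drop nullable occurrences in all combinations) instead of the one-rule-at-a-time loop above; the encoding (single uppercase letters as variables, strings as right-hand sides, "" for ε) and the function name are assumptions for illustration. It removes ε from the language if it was there, which matches the preceding step of handling ε separately.

```python
from itertools import product

def remove_epsilon_rules(rules):
    """Return an equivalent rule set (up to the word ε) without ε rules."""
    # 1. Find the nullable variables: those that can derive ε.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in rules.items():
            if var not in nullable and any(all(s in nullable for s in body) for body in bodies):
                nullable.add(var)
                changed = True
    # 2. For every rule, keep or drop each nullable occurrence independently.
    result = {}
    for var, bodies in rules.items():
        new_bodies = set()
        for body in bodies:
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:                 # never emit an ε rule
                    new_bodies.add(candidate)
        result[var] = sorted(new_bodies)
    return result

# Illustrative grammar: S → aAB, A → aA | B, B → bB | ε
grammar = {"S": ["aAB"], "A": ["aA", "B"], "B": ["bB", ""]}
for var, bodies in remove_epsilon_rules(grammar).items():
    print(var, "→", " | ".join(bodies))
```

For the illustrative grammar both A and B are nullable, so S gains the bodies aA, aB and a, A gains a, and B gains b, while the ε rule disappears.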

Example 1:

  S → aBBAC
  A → ε
  B → ε

Removing A → ε:
  S → aBBAC | aBBC
  B → ε

Removing B → ε:
  S → aBBAC | aBBC
  S → aBAC | aAC | aBC | aC

Example 2:

  S → aAB
  A → aA | B
  B → bB | ε

Removing B → ε:
  S → aAB | aA
  A → aA | B | ε
  B → bB | b

Removing A → ε:
  S → aAB | aA | aB | a
  A → aA | B | a
  B → bB | b

Example 3:

  S → aA
  A → ε

Removing A → ε:
  S → aA | a

• The first rule, S → aA, can no longer be used to derive any word, because A has no rules left, so it can be deleted.

• The minimized grammar is: S → a

Example 4:

  S → aBaC
  B → bB | C
  C → cC | ε

Delete C → ε:
  S → aBaC | aBa
  B → bB | C | ε
  C → cC | c

Delete B → ε:
  S → aBaC | aBa | aaC | aa
  B → bB | C | b
  C → cC | c

Step 2: Removing Unit Rules

A context-free grammar that contains unit rules can be rewritten without unit rules.

Algorithm for removing unit rules:

1. For each unit rule A → B, remove this rule from R and add all productions of B to A: for each B → α in R, add the rule A → α.

2. Repeat step 1 until all unit rules are removed.
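A compact way to get the same result is sketched below. Instead of repeating step 1 rule by rule, it computes for each variable the set of variables reachable through unit rules only and copies their non-unit productions; this is a swapped-in but equivalent formulation, and the encoding (single uppercase letters as variables, strings as right-hand sides) is an assumption for illustration.

```python
def remove_unit_rules(rules):
    """Replace every unit rule A → B by the non-unit productions B leads to."""
    def is_unit(body):
        return len(body) == 1 and body in rules

    result = {}
    for var in rules:
        # Variables reachable from var using unit rules only (including var).
        reachable, stack = {var}, [var]
        while stack:
            current = stack.pop()
            for body in rules[current]:
                if is_unit(body) and body not in reachable:
                    reachable.add(body)
                    stack.append(body)
        # Collect every non-unit production of the reachable variables.
        result[var] = sorted({b for v in reachable for b in rules[v] if not is_unit(b)})
    return result

# Illustrative grammar: S → A | b, A → B | b, B → bB | a
grammar = {"S": ["A", "b"], "A": ["B", "b"], "B": ["bB", "a"]}
for var, bodies in remove_unit_rules(grammar).items():
    print(var, "→", " | ".join(bodies))
```

Variables that become unreachable from the start variable are still listed in the output; removing them is a separate clean-up step.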

Example:

  S → A | b
  A → B | b
  B → bB | a

First we eliminate the unit rule A → B:
  S → A | b
  A → bB | a | b
  B → bB | a

Next we eliminate the unit rule S → A:
  S → bB | a | b
  A → bB | a | b
  B → bB | a

• A is not reachable from S and can be removed. The resulting grammar is:
  S → bB | a | b
  B → bB | a

Chomsky Normal Form

• Any context-free grammar can be written in a special form called Chomsky Normal Form (CNF).

Definition

A CFG is in CNF if every rule is of the form:

  A → BC or A → a

where a ∈ Σ, A, B, C ∈ V, and B, C ≠ S.

• If the language contains ε, then the rule S → ε is allowed.

Noam Chomsky (from Wikipedia)

An American linguist, philosopher, cognitive scientist, political activist, author, and lecturer. He is an Institute Professor and professor emeritus of linguistics at the Massachusetts Institute of Technology.

Chomsky is well known in the academic and scientific community as one of the fathers of modern linguistics. Since the 1960s, he has become known more widely as a political dissident, an anarchist, and a libertarian socialist intellectual. Chomsky is often viewed as a notable figure in contemporary philosophy.

Chomsky Normal Form (cont.)

A grammar in Chomsky Normal Form has several properties and usages:

– Any string of length n ≥ 1 is derived in exactly 2n-1 steps.

– The parse tree is a binary tree.

[Figure: a binary parse tree with root S, internal variable nodes A, B, C, D and a terminal leaf a.]
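The 2n-1 count follows from a short counting argument. The sketch below assumes a word w of length n ≥ 1 and a grammar strictly in CNF, so that every step uses either a rule A → BC or a rule A → a:

```latex
\begin{align*}
\#\{\text{steps using } A \to BC\} &= n - 1
  && \text{each such step adds one variable, growing 1 variable into } n \\
\#\{\text{steps using } A \to a\} &= n
  && \text{each of the } n \text{ terminals of } w \text{ comes from its own rule} \\
\text{total number of steps} &= (n - 1) + n = 2n - 1
\end{align*}
```

For example, a word of length 3 needs two binary steps and three terminal steps, five in total.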

Chomsky Normal Form (cont.)

Theorem: Any context-free language can be generated by a grammar in CNF.

Converting CFG to CNF

1. Add a new start symbol S' and the rule S' → S to the CFG.

2. Remove all ε-rules.

3. Remove all unit rules.

4. Convert all remaining rules into a proper form:

– 4.1 Replace each terminal a in a rule whose right-hand side has two or more symbols with a variable A_a and add the rule A_a → a to the CFG.

– 4.2 For each rule of the form A → B1B2..Bn where n > 2, replace it with the two following rules: A → B1C and C → B2..Bn

– 4.3 Repeat step 4 until all rules have the proper form (right-hand side of length ≤ 2).
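Steps 4.1 and 4.2 can be sketched as follows, assuming ε rules and unit rules have already been removed. The encoding (rules as a dict from variable name to a list of tuples of symbols) and the generated names `T_a` and `C1, C2, ...` are hypothetical and assumed not to clash with existing variables. Identical tails share one new variable.

```python
def to_binary_form(rules, terminals):
    """Apply steps 4.1 and 4.2: isolate terminals, then split long bodies."""
    result = {}
    term_var = {}   # terminal -> variable that produces it
    tail_var = {}   # tail of a long body -> variable that produces it

    def var_for_terminal(a):
        if a not in term_var:
            term_var[a] = f"T_{a}"
            result[term_var[a]] = [(a,)]          # T_a → a
        return term_var[a]

    def shorten(body):
        if len(body) <= 2:
            return body
        return (body[0], var_for_tail(body[1:]))  # A → B1 C

    def var_for_tail(tail):
        if tail not in tail_var:
            tail_var[tail] = f"C{len(tail_var) + 1}"
            result[tail_var[tail]] = [shorten(tail)]  # C → B2..Bn, split again if needed
        return tail_var[tail]

    for var, bodies in rules.items():
        new_bodies = []
        for body in bodies:
            if len(body) >= 2:  # step 4.1 applies only to bodies with two or more symbols
                body = tuple(var_for_terminal(x) if x in terminals else x for x in body)
            new_bodies.append(shorten(body))          # step 4.2
        result[var] = new_bodies
    return result

# Illustrative grammar: S' → 1 | 0B0, B → 0B0 | 1 | 0
grammar = {"S'": [("1",), ("0", "B", "0")], "B": [("0", "B", "0"), ("1",), ("0",)]}
cnf = to_binary_form(grammar, {"0", "1"})
for var, bodies in cnf.items():
    print(var, "→", " | ".join(" ".join(body) for body in bodies))
```

Sharing a variable between identical tails keeps the resulting grammar small; generating a fresh variable for every split would be equally valid.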

Example

Write the following grammar in CNF.

  S → A | 0B0
  A → S | 1
  B → A | 0

Example (cont.)

• Add an S' → S rule.

• There are no ε-rules.

• Unit rules are S → A, A → S, B → A, S' → S.

• We start from A → S:
  A → 1 (old rule)
  A → A | 0B0 (new rules; the rule A → A has no meaning and is dropped)
  so we are left with A → 0B0 | 1

Example (cont.)

• Next, eliminate B → A.
  B → 0 (old rule)
  B → 0B0 | 1 (new rule)
  together: B → 0B0 | 1 | 0

• Next, eliminate S → A.
  S → 0B0 (old rule)
  S → 0B0 | 1 (new rule)
  together: S → 0B0 | 1

• Next, eliminate S' → S.
  S' → 0B0 | 1 (new rule)

Example (cont.)

• Throw away the rules of S and A since S and A are not reachable from S'.
  S' → 1 | 0B0
  B → 0B0 | 1 | 0

• Write all rules in Chomsky form, using a new variable S0 for the terminal 0:
  S' → 1 | S0 B S0
  B → S0 B S0 | 1 | 0
  S0 → 0

• Replace the right-hand side S0 B S0 with S0 C and add C → B S0.

• The resulting final grammar is:
  S' → 1 | S0 C
  B → S0 C | 1 | 0
  C → B S0
  S0 → 0

Chomsky Hierarchy

The Chomsky hierarchy consists of 4 types of grammars:

1. Regular (type 3)

2. Context-free (type 2)

3. Context-sensitive (type 1)

4. Recursively enumerable (type 0)

Chomsky Hierarchy (cont.)

Regular grammars:

– Restricted to rules of the form:

  S → a or S → aB

  where a ∈ Σ and S, B ∈ V (different from our definition, which allows a ∈ Σ*)

• Generate regular languages.

• Can be decided by a finite automaton (FA).

Chomsky Hierarchy (cont.)

Context-free grammars:

– Restricted to rules of the form:

  A → γ

  where A ∈ V and γ ∈ (V ∪ Σ)*

• Generate context-free languages.

• Can be decided by a pushdown automaton (PDA).

Chomsky Hierarchy (cont.)

Context-sensitive grammars:

– Restricted to rules of the form:

  αAβ → αγβ

  where A ∈ V and α, β, γ ∈ (V ∪ Σ)*

• Generate context-sensitive languages.

• Can be decided by a linear-bounded nondeterministic Turing machine (LBTM).

Example:

L = {a^n b^n | n ≥ 1}

  S → aSBC | abC
  CB → BC
  bB → bb
  bC → b

Derivation of aabb:

  S ⇒ aSBC ⇒ aabCBC ⇒ aabBC ⇒ aabbC ⇒ aabb
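The derivation can be replayed as plain string rewriting, since the left-hand side of a rule may now be longer than one symbol. This is a minimal sketch; the function name `rewrite` and the choice to always rewrite the leftmost occurrence are assumptions made for illustration.

```python
def rewrite(form, lhs, rhs):
    """Replace the leftmost occurrence of lhs in form by rhs."""
    position = form.find(lhs)
    if position < 0:
        raise ValueError(f"{lhs!r} does not occur in {form!r}")
    return form[:position] + rhs + form[position + len(lhs):]

# The rule applications used in the derivation of aabb, in order.
steps = [("S", "aSBC"), ("S", "abC"), ("bC", "b"), ("bB", "bb"), ("bC", "b")]

forms = ["S"]
for lhs, rhs in steps:
    forms.append(rewrite(forms[-1], lhs, rhs))
print(" => ".join(forms))
```

Each step looks at a variable together with its neighbouring symbols, which is what distinguishes these rules from context-free ones.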

Chomsky Hierarchy (cont.)

Recursively enumerable grammars:

– No restrictions on rules.

• Generate the recursively enumerable (RE) languages.

• Can be recognized by a Turing machine (the machine accepts every word in the language but need not halt on words outside it).