CFGs are more powerful than regular expressions. …2011/07/01  · Normal Forms for CFG In a...

Context-free grammars (CFGs) are more powerful than regular expressions: every language that can be expressed using a regular expression can also be expressed using a context-free grammar, but CFGs can also express languages that have no regular expression, such as the palindromes over {0, 1}.

Context-Free Grammar

Construct a context free grammar for palindromes containing 0s and 1s.
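The exercise above can be checked with a small sketch. It assumes one common answer, the grammar S → ε | 0 | 1 | 0S0 | 1S1 (the slide's own solution is not available here), and the function name `derives` is hypothetical. The recursion mirrors the productions directly.

```python
def derives(s: str) -> bool:
    """Return True if s can be derived from S in the assumed grammar
    S -> ε | 0 | 1 | 0S0 | 1S1 (palindromes over {0, 1})."""
    if any(ch not in "01" for ch in s):
        return False              # only the terminals 0 and 1 are allowed
    if len(s) <= 1:
        return True               # S -> ε, S -> 0, S -> 1
    # S -> 0S0 or S -> 1S1: the outer symbols must match,
    # and the inner part must again be derivable from S
    return s[0] == s[-1] and derives(s[1:-1])


if __name__ == "__main__":
    for word in ["", "0", "0110", "01010", "01", "0010"]:
        print(repr(word), derives(word))
```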

Construct a context-free grammar for all integers (with sign).

G = (V, Σ, R, S)

V = {S, G, D, I}, Σ = {0, 1, 2, …, 9, +, -}

Derivation Trees

Ambiguity

Consider the grammar for arithmetic expressions involving addition and multiplication operators:

Consider the sentence ID+ID×ID.

This can be parsed in two different ways.

In the figure, the parse tree on the left gives the addition operator precedence over multiplication. In other words, an expression such as 3+5×9 is evaluated as (3+5)×9, with a result of 72. The tree on the right follows the standard practice in programming languages and gives × precedence over +, so the same expression is evaluated as 3+(5×9), resulting in 48.
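The two readings can be counted mechanically. The sketch below assumes the usual ambiguous expression grammar E → E + E | E × E | ID (the grammar on the slide was a figure and is not reproduced here); `count_parse_trees` is a hypothetical helper that counts the derivation trees of a token sequence under that assumed grammar.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count derivation trees of `tokens` under the assumed grammar
    E -> E + E | E × E | ID."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of trees deriving tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "ID" else 0
        total = 0
        # Try every operator position k as the root of the tree:
        # E -> E op E splits the span into tokens[i:k] and tokens[k+1:j].
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "×"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


if __name__ == "__main__":
    print(count_parse_trees(["ID", "+", "ID", "×", "ID"]))
```

A count greater than 1 for some sentence is exactly what makes the grammar ambiguous.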

A word w ∈ L(G) is ambiguous if it has two or more distinct derivation trees. A context-free grammar G is ambiguous if there exists some w ∈ L(G) that is ambiguous.

If G is the grammar

Show that G is ambiguous.

To prove that G is ambiguous, we need to find a w ∈ L(G) that is ambiguous. Consider the word abababa.

Removing Ambiguity

What about a + a × a ?

Simplification of CFG

If G is a CFG, then we can construct a simplified equivalent CFG G’ using the following steps:

1. Eliminate all null productions to get G1.

2. Eliminate all unit productions in G1 to get G2.

3. Construct a reduced Grammar G’ from G2.

1. Elimination of null productions

A null production is a production of the form A → ε. Let G1 = (V, Σ, P’, S) be the CFG having no null productions.

Consider the grammar G whose productions are given below. Construct a grammar G1 without null productions generating L(G) - {ε}.

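Step 1: Construction of the set W of all nullable variables

W1 = {A1 ∈ V | A1 → ε is a production in P} = {A, B}

Wi+1 = Wi ∪ {K ∈ V | there exists a production K → α with α ∈ Wi*}

W2 = {A, B} ∪ {S} = {S, A, B}, as S → AB is a production with AB ∈ W1*

W3 = W2 ∪ ∅ = W2

Step 2: Construction of P’ by erasing nullable variables from the R.H.S

Every production whose R.H.S contains no nullable variable is kept. For every other non-null production, we add each production obtained by erasing some or all of the nullable variables on its R.H.S, provided the result is not ε.

P’: D → b, S → aS, S → AB, S → a, S → A, S → B.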
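The two steps can be sketched in a few lines. The input grammar S → aS | AB, A → ε, B → ε, D → b is an assumption reconstructed from the sets W1, W2 and the list P’ above, since the original productions were shown as a figure. The function names and the encoding (each production is a pair of head and body, the body is a string of one-character symbols, and "" stands for ε) are choices made for this illustration only.

```python
from itertools import product


def nullable_variables(productions):
    """Fixed point of the W_i construction: variables that derive ε."""
    nullable = {head for head, body in productions if body == ""}
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # K -> α with every symbol of α already nullable
            if head not in nullable and body and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_null(productions):
    """Build P' by erasing nullable variables from each R.H.S in every
    possible way, never producing an empty R.H.S."""
    nullable = nullable_variables(productions)
    result = set()
    for head, body in productions:
        if body == "":
            continue  # drop the null productions themselves
        # each nullable symbol may either stay or be erased
        options = [(s, "") if s in nullable else (s,) for s in body]
        for choice in product(*options):
            new_body = "".join(choice)
            if new_body:
                result.add((head, new_body))
    return result


if __name__ == "__main__":
    # Assumed input grammar (reconstructed, not taken verbatim from the slides)
    grammar = [("S", "aS"), ("S", "AB"), ("A", ""), ("B", ""), ("D", "b")]
    print(sorted(nullable_variables(grammar)))
    print(sorted(eliminate_null(grammar)))
```

The loop in `nullable_variables` adds variables one at a time rather than building W1, W2, … level by level, but it stops at the same set W.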

2. Elimination of unit productions

A unit production in a context-free grammar G is a production of the form A → B, where A and B are variables (non-terminals) in G.

For each variable A, the following steps have to be applied:

Step1: Construction of the set of variables derivable from A.

W0(A) = {A}

Wi+1(A) = Wi(A) ∪ {B ∈ V | C → B is in P with C ∈ Wi(A)}

At some point Wk+1(A) = Wk(A); we write W(A) for this set.

Step2: Construction of A-productions

The A-productions in G2 are either

(a) the non-unit A-productions in G1, or

(b) A → α whenever B → α is in G1 with B ∈ W(A) and α ∉ V.

Now we can define G2 = (V, Σ, P2, S), where P2 is constructed using Step2 for every A ∈ V.

Example:

Let the productions in G1 be

S → AB, A → a, B → C | b, C → D, D → E and E → a

W(B) = {B, C, D, E}, W(C) = {C, D, E}, W(D) = {D, E}. For the remaining variables, W(S) = {S}, W(A) = {A} and W(E) = {E}.

The productions P2 in G2 are:

S → AB, A → a, B → a | b, C → a, D → a and E → a
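A minimal sketch of the same construction, run on the example above. The names `unit_closure` and `eliminate_unit` and the encoding (bodies are strings of one-character symbols) are assumptions made for this illustration.

```python
def unit_closure(productions, variables, start):
    """W(start): variables reachable from `start` through unit productions."""
    closure = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for head, body in productions:
            # a unit production current -> B, with B a single variable
            if head == current and body in variables and body not in closure:
                closure.add(body)
                frontier.append(body)
    return closure


def eliminate_unit(productions, variables):
    """P2: for each A, copy every non-unit production of each B in W(A)."""
    result = set()
    for a in variables:
        for b in unit_closure(productions, variables, a):
            for head, body in productions:
                if head == b and body not in variables:
                    result.add((a, body))
    return result


if __name__ == "__main__":
    variables = {"S", "A", "B", "C", "D", "E"}
    g1 = [("S", "AB"), ("A", "a"), ("B", "C"), ("B", "b"),
          ("C", "D"), ("D", "E"), ("E", "a")]
    print(sorted(unit_closure(g1, variables, "B")))
    print(sorted(eliminate_unit(g1, variables)))
```

Because W(A) always contains A itself, case (a) of Step2 is covered by the same loop as case (b).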

3. Construction of reduced grammars

Many productions in P may not be useful for the purpose of derivation.

It would be better to eliminate (1) variables that do not derive any terminal string and (2) symbols that are not reachable from the start symbol.

If G is a CFG, then we can find an equivalent grammar G’ such that each variable in G’ derives some terminal string.

Let G2=(V, Σ ,P,S). We can define G’ =(V’, Σ ,P’,S) as follows.

a) Construction of V’:

We define Wi ⊆ V by recursion.

W1 = {A ∈ V | there exists a production A → ω where ω ∈ Σ*}

Wi+1 = Wi ∪ {A ∈ V | there exists some production A → α with α ∈ (Σ ∪ Wi)*}

At some point Wk = Wk+1. Then we get V’ = Wk.

b) Construction of P’:

P’ = {A → α in P | A ∈ V’ and α ∈ (V’ ∪ Σ)*}

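Example: Let G be

S → AB, A → a, B → b, B → C, and E → c

Find G’ such that every variable in G’ derives some terminal string.

Here W1 = {A, B, E} and W2 = {S, A, B, E} = W3, so we get V’ = {S, A, B, E}. C is excluded because there is no C-production.

So P’ = {S → AB, A → a, B → b, E → c}. The production B → C is dropped because C ∉ V’.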
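The construction of V’ and P’ can be sketched as follows, using the example grammar above. The function names and the string encoding of productions are assumptions for this illustration.

```python
def generating_variables(productions, terminals):
    """Fixed point of W_i: variables that derive some terminal string."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # A -> α with α consisting of terminals and known-generating variables
            if head not in generating and all(
                s in terminals or s in generating for s in body
            ):
                generating.add(head)
                changed = True
    return generating


def keep_generating(productions, terminals):
    """P': productions that only mention terminals and variables of V'."""
    v_prime = generating_variables(productions, terminals)
    return [
        (head, body)
        for head, body in productions
        if head in v_prime and all(s in terminals or s in v_prime for s in body)
    ]


if __name__ == "__main__":
    g = [("S", "AB"), ("A", "a"), ("B", "b"), ("B", "C"), ("E", "c")]
    print(sorted(generating_variables(g, {"a", "b", "c"})))
    print(keep_generating(g, {"a", "b", "c"}))
```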

If G = (V, Σ, P, S), we can construct an equivalent grammar G’ = (V’, Σ’, P’, S) such that every symbol in V’ ∪ Σ’ is derivable from S, i.e. appears in some string derivable from S.

a) Construction of Wi

W1 = {S}.

Wi+1 = Wi ∪ {X ∈ (V ∪ Σ) | there exists a production A → α with A ∈ Wi and α containing the symbol X}

At some point Wk = Wk+1.

b) Construction of V’, Σ’, P’.

V’ = V ∩ Wk, Σ’ = Σ ∩ Wk, P’ = {A → α in P | A ∈ Wk}

Example: Consider G = ({S, A, B, E}, {a, b, c}, P, S), where P consists of S → AB, A → a, B → b and E → c.

W1 = {S}, W2 = {S, A, B}, W3 = {S, A, B} ∪ {a, b} = W4

V’ = {S, A, B}, Σ’ = {a, b}

P’ = {S → AB, A → a, B → b}
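A short sketch of the reachability construction on the example above. `reachable_symbols` is a hypothetical name, and productions are again encoded as pairs of head and a string of one-character symbols.

```python
def reachable_symbols(productions, start):
    """Fixed point of W_i: all symbols (variables and terminals) that occur
    in some string derivable from the start symbol."""
    reached = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for head, body in productions:
            if head == current:
                for symbol in body:
                    if symbol not in reached:
                        reached.add(symbol)
                        frontier.append(symbol)
    return reached


def keep_reachable(productions, start):
    """P': productions whose left-hand side is reachable from the start symbol."""
    reached = reachable_symbols(productions, start)
    return [(head, body) for head, body in productions if head in reached]


if __name__ == "__main__":
    g = [("S", "AB"), ("A", "a"), ("B", "b"), ("E", "c")]
    print(sorted(reachable_symbols(g, "S")))
    print(keep_reachable(g, "S"))
```

Terminals placed on the frontier have no productions, so they are recorded as reached and contribute nothing further.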

Find reduced grammars equivalent to the following

1) S →AB|CA, B →BC|AB, A →a, C →aB|b .

2) S →aAa, A →Sb|bCC|DaA, C →abb|DD, E →aC, D →aDA .

Answers:

1) S →CA, A →a, C →b

2) S →aAa, A →Sb|bCC, C →abb

Normal Forms for CFG

In a context-free grammar, the R.H.S of a production can be any string of variables and terminals. When the productions in G satisfy certain restrictions, G is said to be in a ‘normal form’.

1. Chomsky Normal Form(CNF)

In CNF we have restrictions on the length of the R.H.S and the nature of the symbols in the R.H.S.

A CFG G is said to be in CNF if every production is of the form A → a or A → BC, and S → ε is in G if ε ∈ L(G). When ε is in L(G), we assume that S does not appear on the R.H.S of any production.

Example: Consider a CFG whose productions are

S→ AB|ε, A→ a, B→ b

Is it in CNF? YES

Reduction to CNF

Consider an example. Let G be

S→ ABC|aC, A→ a, B→ b, C→ c.

Except S → ABC | aC, all other productions are in the form required for CNF. The terminal a in aC can be replaced by a new variable D, adding the new production D → a, so S → aC becomes S → DC.

S → ABC is not in the required form, so it can be replaced by two productions: S → AE and E → BC.

It is important to note that unit productions and null productions are to be removed before these substitutions.

Elimination of terminals on R.H.S: If a terminal ai appears on the R.H.S of a production together with other symbols, replace it by a new variable (non-terminal) Cai and add the production Cai → ai.

Restricting the number of variables on R.H.S: Consider productions of the form A → A1A2…Am with m ≥ 3. We replace each of them by the new productions A → A1C1, C1 → A2C2, …, Cm-2 → Am-1Am.

Example1: Reduce the following grammar G to CNF:

S → aAD, A → aB | bAB, B → b, D → d.

Answer: S → CaC1, A → CaB | CbC2, C1 → AD, C2 → AB, B → b, D → d, Ca → a, Cb → b.

Example2: Reduce the following grammar G to CNF:

S → aAbB, A → aA | a, B → bB | b.

Answer: S → CaC1, C1 → AC2, C2 → CbB, A → CaA, B → CbB, Ca → a, Cb → b, A → a, B → b.
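The two CNF steps can be sketched as below and checked against Example1. The function name `to_cnf`, the tuple encoding of productions, and the convention that lowercase strings are terminals are assumptions made for this illustration. The sketch also assumes that null and unit productions have already been removed.

```python
def to_cnf(productions):
    """Convert productions (head, body-tuple) to CNF.
    Lowercase symbols are treated as terminals, all others as variables."""
    # Step 1: in every R.H.S of length >= 2, replace terminal a by variable Ca
    terminal_var = {}
    step1 = []
    for head, body in productions:
        if len(body) >= 2:
            new_body = []
            for symbol in body:
                if symbol.islower():
                    terminal_var.setdefault(symbol, "C" + symbol)
                    new_body.append(terminal_var[symbol])
                else:
                    new_body.append(symbol)
            body = tuple(new_body)
        step1.append((head, body))
    step1 += [(var, (terminal,)) for terminal, var in terminal_var.items()]

    # Step 2: split A -> A1 A2 ... Am (m >= 3) using fresh variables C1, C2, ...
    result = []
    counter = 0
    for head, body in step1:
        while len(body) > 2:
            counter += 1
            fresh = f"C{counter}"
            result.append((head, (body[0], fresh)))
            head, body = fresh, body[1:]
        result.append((head, body))
    return result


if __name__ == "__main__":
    example1 = [("S", ("a", "A", "D")), ("A", ("a", "B")),
                ("A", ("b", "A", "B")), ("B", ("b",)), ("D", ("d",))]
    for head, body in to_cnf(example1):
        print(head, "->", "".join(body))
```

The fresh names Ca, Cb, C1, C2 follow the naming used in the answers above; any unused variable names would work equally well.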

2. Greibach Normal Form(GNF)

A grammar is in GNF if every production is of the form A → aα, where a ∈ Σ and α ∈ V* (α may be ε). S → ε is in G when ε is in L(G), and in that case we assume that S does not appear on the R.H.S of any production.

Example: S → aAB | ε, A → bC, B → b, C → c is in GNF.

Lemma1: Let G = (V, Σ, P, S) be a CFG, and let A → Bλ be a production in P, where λ is any string in (V ∪ Σ)*. Assume that the B-productions in P are B → β1 | β2 | … | βs.

We can define P1 = (P - {A → Bλ}) ∪ {A → βiλ | 1 ≤ i ≤ s}.

Then G1 = (V, Σ, P1, S) is a CFG equivalent to G.

Example: Consider G with the following productions:

A → Bab, B → aA|bB|aa|AB

G1 which is equivalent to G can be constructed with the following productions.

A → aAab| bBab|aaab|ABab,

B → aA|bB|aa|AB
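Lemma1 is a plain substitution, which the following sketch reproduces on the example above. `apply_lemma1` is a hypothetical helper, and each production is encoded as a pair of head and a string of one-character symbols.

```python
def apply_lemma1(productions, target):
    """Replace the production target = (A, Bλ) by A -> βiλ for every
    B-production B -> βi, leaving all other productions unchanged."""
    head, body = target
    leading, rest = body[0], body[1:]          # B and λ
    b_bodies = [b for h, b in productions if h == leading]
    result = []
    for production in productions:
        if production == target:
            result.extend((head, beta + rest) for beta in b_bodies)
        else:
            result.append(production)
    return result


if __name__ == "__main__":
    g = [("A", "Bab"), ("B", "aA"), ("B", "bB"), ("B", "aa"), ("B", "AB")]
    for head, body in apply_lemma1(g, ("A", "Bab")):
        print(head, "->", body)
```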

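Lemma2: Let G = (V, Σ, P, S) be a CFG in which the A-productions are

A → Aα1 | Aα2 | … | Aαr | β1 | β2 | … | βs (the βi’s do not start with A).

Let Z be a new variable. Let G1 = (V ∪ {Z}, Σ, P1, S), where P1 is obtained from P by replacing the A-productions with the following:

A → β1 | β2 | … | βs

A → β1Z | β2Z | … | βsZ

Z → α1 | α2 | … | αr

Z → α1Z | α2Z | … | αrZ

Then G1 is a CFG equivalent to G.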

Example: Consider G with the following productions:

A →aBD|bDB|c|AB|AD

Then G1, equivalent to G contains the following productions:

A →aBD|bDB|c

A →aBDZ|bDBZ|cZ

Z →B|D

Z →BZ|DZ
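Lemma2 removes left recursion on A. The sketch below applies it to the example above; `apply_lemma2` is a hypothetical helper, bodies are strings of one-character symbols, and every body is assumed to be non-empty (null productions already removed).

```python
def apply_lemma2(productions, var, fresh):
    """Replace the `var`-productions A -> Aα1|...|Aαr|β1|...|βs by
    A -> βi | βiZ and Z -> αj | αjZ, where Z is the fresh variable."""
    alphas = [b[1:] for h, b in productions if h == var and b[0] == var]
    betas = [b for h, b in productions if h == var and b[0] != var]
    others = [(h, b) for h, b in productions if h != var]
    new = [(var, beta) for beta in betas]
    new += [(var, beta + fresh) for beta in betas]
    new += [(fresh, alpha) for alpha in alphas]
    new += [(fresh, alpha + fresh) for alpha in alphas]
    return others + new


if __name__ == "__main__":
    g = [("A", "aBD"), ("A", "bDB"), ("A", "c"), ("A", "AB"), ("A", "AD")]
    for head, body in apply_lemma2(g, "A", "Z"):
        print(head, "->", body)
```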

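Reduction to GNF

Step1: Check whether the given grammar is in CNF. If it is not, convert it to CNF. Rename the variables as A1, A2, …, An, with S = A1.

Step2: Bring the Ai-productions into the form Ai → aλ or Ai → Ajλ with i < j, working through i = 1, 2, …, n. If there is a production Ai → Ajλ with j < i, apply Lemma1 to it. If there are productions of the form Ai → Aiλ, apply Lemma2, with a new variable Zi, to get rid of them. After this, every An-production starts with a terminal, so Lemma1 can be applied to the productions of An-1, …, A1 in turn until every Ai-production starts with a terminal.

Step3: Modify the Zi-productions. Apply Lemma1 to eliminate productions of the form Zi → Akλ.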

Example1: Construct a grammar in GNF equivalent to the grammar: S→ AA|a, A→ SS|b

Answer: The given grammar is in CNF. Let S be A1 and A be A2.

A1 →A2A2

A1 →a

A2 →A1A1

A2 →b

A1 productions are in the required form. A2 →b is also in the required form.

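Apply Lemma1 to A2 → A1A1, replacing the leading A1. This gives A2 → A2A2A1 | aA1.

We need to apply Lemma2 to A2 → A2A2A1. Let Z2 be the new variable. The A2-productions become A2 → aA1 | b | aA1Z2 | bZ2, with Z2 → A2A1 | A2A1Z2.

Now we can eliminate A1 → A2A2 using Lemma1. The A1-productions become A1 → aA1A2 | bA2 | aA1Z2A2 | bZ2A2 | a.

Finally, we apply Lemma1 to the Z2-productions to get

Z2 → aA1A1 | bA1 | aA1Z2A1 | bZ2A1

Z2 → aA1A1Z2 | bA1Z2 | aA1Z2A1Z2 | bZ2A1Z2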

Example2: Convert the following grammar into GNF. (The grammar, the renaming of its variables and the resulting productions were shown as figures.)
