SIMPLIFYING GRAMMARS
Definition: A useless symbol of a context-free
grammar is one which does not occur in the
derivation of any sentence of that grammar.
For example: G → RT
R → Ra
T → b
Here R is useless.
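Clearly a symbol is useless if either:
a) we cannot derive any string containing it from
the goal symbol, or
b) we cannot derive a terminal string from that
symbol.
Conversely, once every production containing a symbol
of kind b) has been removed, any symbol that is still
useless is of kind a).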
Notation: a) is expressed by saying that the symbol
is not reachable from the goal symbol. b) is
expressed by saying that the symbol does not
derive a terminal string
Algorithm: To find all those symbols that are not reachable from the goal symbol.
1) Make a list of all the grammar symbols, all initially
unflagged.
2) Flag the goal symbol.
3) Go through the grammar from the 1st production to the last.
If A→ x1x2…xn is one of these productions and A is
flagged, then flag x1,x2,…,xn (those ones not already
flagged).
4) Were any new symbols flagged during the iteration of
step 3? If so, repeat step 3; otherwise stop. Any
symbol that has not been flagged at this stage is not
reachable from the goal symbol.
EXAMPLE
Grammar 1
z → b e
a → a e | e
b → c e | a f
c → c f
d → f
d is not reachable
Grammar 2
Z → E + T
E → E | S + F | T
F → F | F P | P
P → G
G → G | G G | F
T→ T * i | i
Q → E | E + F | T | S
S → i
Q is not reachable
Grammar 3
G → A
Q → P R
P→Q
Q, P, R are not reachable
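The reachability algorithm above can be sketched in Python as follows. This is only an illustration under assumed conventions: a grammar is a list of (left-hand side, right-hand side) pairs with one pair per alternative, and the names `reachable_symbols`, `grammar1`, and `grammar3` are hypothetical, not part of the slides.

```python
def reachable_symbols(productions, goal):
    """Return the set of symbols reachable from the goal symbol."""
    flagged = {goal}                      # step 2: flag the goal symbol
    changed = True
    while changed:                        # step 4: repeat while a pass flags something new
        changed = False
        for lhs, rhs in productions:      # step 3: first production to last
            if lhs in flagged:
                for symbol in rhs:
                    if symbol not in flagged:
                        flagged.add(symbol)
                        changed = True
    return flagged


def all_symbols(productions):
    """Step 1: every symbol that occurs anywhere in the grammar."""
    return {s for lhs, rhs in productions for s in (lhs, *rhs)}


# Grammar 1 from the example (assumed encoding: one pair per alternative).
grammar1 = [
    ("z", ("b", "e")),
    ("a", ("a", "e")), ("a", ("e",)),
    ("b", ("c", "e")), ("b", ("a", "f")),
    ("c", ("c", "f")),
    ("d", ("f",)),
]
# Grammar 3 from the example.
grammar3 = [("G", ("A",)), ("Q", ("P", "R")), ("P", ("Q",))]

unreachable1 = all_symbols(grammar1) - reachable_symbols(grammar1, "z")
unreachable3 = all_symbols(grammar3) - reachable_symbols(grammar3, "G")
print(sorted(unreachable1), sorted(unreachable3))
```

The unflagged symbols agree with the results stated for Grammars 1 and 3: d in the first case, and Q, P, R in the second.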
Algorithm: To determine which symbols do not
derive a terminal string.
1) Make a fresh list of all the symbols, initially
unflagged.
2) Flag all the terminals.
3) Go through the grammar from the first production
to the last. If A→ x1x2…xn is any such production,
then if x1,x2,…,xn are all flagged, flag A.
4) Were any new symbols flagged in step 3? If so, go
back to step 3. If not, all symbols not flagged at
this stage do not derive a terminal string.
TRY THE ABOVE ALGORITHM ON
GRAMMARS 1-3 ABOVE
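As a check on a hand calculation, the flagging algorithm for symbols that derive a terminal string can be sketched in Python. The encoding is an assumption: productions are (left-hand side, right-hand side) pairs, and any symbol that never appears on a left-hand side is treated as a terminal. The function and variable names are hypothetical.

```python
def symbols_deriving_terminal_strings(productions, terminals):
    """Return every symbol from which some terminal string can be derived."""
    flagged = set(terminals)              # step 2: flag all the terminals
    changed = True
    while changed:                        # step 4: repeat while a pass flags something new
        changed = False
        for lhs, rhs in productions:      # step 3: first production to last
            if lhs not in flagged and all(s in flagged for s in rhs):
                flagged.add(lhs)
                changed = True
    return flagged


# Grammar 1, one (lhs, rhs) pair per alternative.
grammar1 = [
    ("z", ("b", "e")),
    ("a", ("a", "e")), ("a", ("e",)),
    ("b", ("c", "e")), ("b", ("a", "f")),
    ("c", ("c", "f")),
    ("d", ("f",)),
]
nonterminals = {lhs for lhs, _ in grammar1}
# Assumption: a symbol with no production of its own is a terminal.
terminals = {s for _, rhs in grammar1 for s in rhs} - nonterminals

productive = symbols_deriving_terminal_strings(grammar1, terminals)
unproductive = nonterminals - productive
print(sorted(unproductive))
```

For Grammar 1 the only unflagged symbol is c, whose single production c → c f can never get rid of c.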
Definitions:
1) A ⇒* α means you can derive α from A or α = A
2) A symbol A is said to vanish if A ⇒* ε
3) A production of the form χ → ε is called
an ε-production
Note that the textbook uses λ to denote the empty
string, whereas these slides employ ε for this purpose
Algorithm: To determine which symbols of
a grammar vanish.
1) Make a list of symbols, initially unflagged.
2) Flag all the left hand sides of ε-productions.
3) Go through the grammar from 1st production to
last. If A→ x1x2…xn is any such production, then
if x1,x2,…,xn are all flagged, then flag A.
4) Were any new symbols flagged in step 3? If so,
go back to step 3, else stop. The flagged symbols
are those which vanish.
Example: Try the algorithm on the following
Grammar.
Grammar G4
A → b Y D | A Y c
Y → E F | ε
D → g h i
F → N O | Y N
N → ε
O → Y N
E → Y O N Y
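Once you have worked Grammar G4 by hand, a short Python sketch of the same flagging algorithm can be used to check the result. The encoding is an assumption: an ε-production is written as a pair whose right-hand side is the empty tuple, and the names `vanishing_symbols` and `g4` are hypothetical.

```python
def vanishing_symbols(productions):
    """Return the set of symbols A with A =>* ε."""
    # Step 2: flag the left-hand sides of ε-productions (empty right-hand side).
    flagged = {lhs for lhs, rhs in productions if not rhs}
    changed = True
    while changed:                        # step 4: repeat while a pass flags something new
        changed = False
        for lhs, rhs in productions:      # step 3: first production to last
            if lhs not in flagged and all(s in flagged for s in rhs):
                flagged.add(lhs)
                changed = True
    return flagged


# Grammar G4, one (lhs, rhs) pair per alternative; () stands for ε.
g4 = [
    ("A", ("b", "Y", "D")), ("A", ("A", "Y", "c")),
    ("Y", ("E", "F")), ("Y", ()),
    ("D", ("g", "h", "i")),
    ("F", ("N", "O")), ("F", ("Y", "N")),
    ("N", ()),
    ("O", ("Y", "N")),
    ("E", ("Y", "O", "N", "Y")),
]
vanish = vanishing_symbols(g4)
print(sorted(vanish))
```

Terminals are never flagged here, so any production whose right-hand side contains a terminal can never cause its left-hand side to be flagged.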
Defns:
An ε-production is one of the form A -> ε.
If A, in this case, is the goal symbol, the production
is referred to as a null goal production.
Theorem:
For every cfg G, there exists a cfg G’, such that L(G’) = L(G), and G’ has no ε-productions,
with the exception that if ε ∈ L(G), then G’ contains a null goal production.
Proof. G’ can be formed from G as follows:
1. Discard all the ε-productions.
2. For each production of G, add to the grammar
all possible productions that can be formed
from it by omitting from its rhs some subset
of those symbols (if any) that vanish.
3. Remove all productions with useless symbols.
4. If the goal symbol of G vanishes, add a null goal
production.
Example 1
G -> AVw
A -> aA | a
V -> rUcW | ε    U -> ε    W -> ε
First of all, determine which symbols vanish: U, V, W.
1) Removing the ε-productions gives:
G -> AVw A -> aA | a V -> rUcW
2) Considering G -> AVw in step 2 of the
algorithm, we add to the grammar G -> Aw
Considering V -> rUcW, we add
V -> rc V -> rUc V -> rcW
3) W, U are now useless symbols, so leaving
out all productions with W, U, we get:
G -> AVw | Aw A -> aA | a V -> rc
EXAMPLE. Provide a grammar equivalent to the one below but without ε-productions
S → ABaC
A → BC
B → b | ε
C → D | ε
D → d
Try working this out for yourself before consulting the answer on the next slide. Note carefully that the symbol A is one of those that vanish.
ANSWER
S → ABaC | ABa | AaC | Aa | BaC | Ba | aC | a
A → BC | B | C
B → b
C → D
D → d
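The answer above can be reproduced mechanically. The following Python sketch covers steps 1, 2 and 4 of the construction; step 3 (removing useless symbols) is deliberately left out, which is harmless for this grammar because it produces no useless symbols. The encoding is an assumption: productions are (left-hand side, right-hand side) pairs, () stands for ε, and every function and variable name is hypothetical.

```python
from itertools import product


def vanishing_symbols(productions):
    """Flagging algorithm: the symbols A with A =>* ε."""
    flagged = {lhs for lhs, rhs in productions if not rhs}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in flagged and all(s in flagged for s in rhs):
                flagged.add(lhs)
                changed = True
    return flagged


def eliminate_epsilon_productions(productions, goal):
    """Steps 1, 2 and 4 of the construction (useless symbols are not removed)."""
    vanish = vanishing_symbols(productions)
    result = set()
    for lhs, rhs in productions:
        # Each vanishing symbol may be kept (True) or omitted (False);
        # every other symbol must be kept.
        options = [(True, False) if s in vanish else (True,) for s in rhs]
        for keep in product(*options):
            new_rhs = tuple(s for s, k in zip(rhs, keep) if k)
            if new_rhs:                   # never emit an ε-production
                result.add((lhs, new_rhs))
    if goal in vanish:                    # step 4: null goal production
        result.add((goal, ()))
    return result


# Every symbol is a single character, so tuple("ABaC") splits a
# right-hand side into its symbols.
grammar = [
    ("S", tuple("ABaC")),
    ("A", tuple("BC")),
    ("B", tuple("b")), ("B", ()),
    ("C", tuple("D")), ("C", ()),
    ("D", tuple("d")),
]
new_grammar = eliminate_epsilon_productions(grammar, "S")
for lhs in "SABCD":
    alternatives = sorted("".join(rhs) for l, rhs in new_grammar if l == lhs)
    print(lhs, "→", " | ".join(alternatives))
```

Enumerating keep/omit choices with `itertools.product` is one straightforward way to form "all possible productions" of step 2. It is exponential in the number of vanishing symbols on a right-hand side, which is acceptable for grammars of this size.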
Defn. A unit production of a grammar is one of the form A -> B where A, B are both non-terminals.
Theorem. For any context-free grammar G, there exists a cfg G’ s.t. L(G’) = L(G) and G’ does not contain any unit productions.
Proof. G’ can be formed from G as follows:
1. Eliminate ε-productions from G to form G*
(with possibly a null goal production)
2. If A is the left hand side of a unit production and B is any symbol that can be derived
from A, and B -> α is any production with B as left hand side where α is not a single non-terminal, then add to the grammar A -> α.
By step 1, any derivation of B from A must
consist entirely of a sequence of non-terminals.
Do step 2 for all symbols which are the left hand side of a unit production.
To find all single symbols that can be derived from a symbol A, consider the derivation tree in which no symbol occurs more than once, e.g.:
A
B D E
C F N M
If, say, M → B, we do not include it, as B already occurs in the tree. Hence the depth of the tree
is ≤ the number of unit productions.
3. Now discard all unit productions
EXAMPLE
Consider the grammar:
E → E + T | T
T → T * F | F
F → ( E ) | a
Since E => T and T → T * F,
we add to the grammar E → T * F
and since E => F and F → ( E ) | a,
we add E → ( E ) and E → a
Also since T => F, we add T → ( E ) | a
Discarding all unit productions, then gives us:
E → E + T | T * F | ( E ) | a
T → T * F | ( E ) | a
F → ( E ) | a
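The construction used in this example can be sketched in Python. The sketch assumes that ε-productions have already been eliminated (step 1) and that productions are encoded as (left-hand side, right-hand side) pairs. The names `eliminate_unit_productions` and `expr_grammar` are hypothetical.

```python
def eliminate_unit_productions(productions, nonterminals):
    """Steps 2 and 3: replace unit productions by the productions they lead to."""

    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in nonterminals

    result = set()
    for a in nonterminals:
        # Collect every non-terminal derivable from `a` through unit
        # productions only; each symbol is visited at most once.
        reach, stack = {a}, [a]
        while stack:
            x = stack.pop()
            for lhs, rhs in productions:
                if lhs == x and is_unit(rhs) and rhs[0] not in reach:
                    reach.add(rhs[0])
                    stack.append(rhs[0])
        # Step 2: a inherits every non-unit production of each such symbol.
        # Step 3: unit productions themselves are never copied.
        for lhs, rhs in productions:
            if lhs in reach and not is_unit(rhs):
                result.add((a, rhs))
    return result


# The expression grammar from the example.
expr_grammar = [
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("a",)),
]
no_units = eliminate_unit_productions(expr_grammar, {"E", "T", "F"})
for lhs in "ETF":
    alternatives = sorted(" ".join(rhs) for l, rhs in no_units if l == lhs)
    print(lhs, "→", " | ".join(alternatives))
```

Visiting each symbol at most once mirrors the derivation tree in which no symbol occurs more than once, so the search terminates even when unit productions form a cycle.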
EXAMPLE 3. “Remove” unit productions from:
S → Aa | B
B → A | bb
A → a | bc | B
ANSWER
S → Aa | bb | a | bc since S => B and S => A
B → bb | a | bc since B => A
A → a | bc | bb since A => B
But B is a useless symbol, so discard the production involving B
EXAMPLE 4. “Remove” unit productions from
S → Aa | bb | a | bc | B
B → bb | a | bc
A → a | bc | bb
ANSWER
S → Aa | bb | bc | a
B → bb | a | bc
A → a | bc | bb
Again, B is a useless symbol, and so the
productions involving it should be discarded
Defn. A nice context free grammar is one:
a) without useless symbols,
b) without ε-productions, except possibly for a null
goal production, and
c) without unit productions
Notation. cfg stands for context free grammar,
and ncfg stands for nice context free grammar
Corollary. For every cfg G, there exists an ncfg G’,
such that L(G’) = L(G).