7/31/2019 MELJUN CORTES -- AUTOMATA THEORY LECTURE - 9
1/22
CSC 3130: Automata theory and formal languages
Andrej Bogdanov
http://www.cse.cuhk.edu.hk/~andrejb/csc3130
Normal forms and parsing
Fall 2008
Testing membership and parsing
Given a grammar
  S → 0S1 | 1S0S1 | T
  T → S | ε
How can we know if a string x is in its language?
If so, can we reconstruct a parse tree for x?
First attempt
Maybe we can try all possible derivations:
  S → 0S1 | 1S0S1 | T
  T → S | ε
x = 00111
  S ⇒ 0S1 ⇒ 00S11
          ⇒ 01S0S11
          ⇒ 0T1
  S ⇒ 1S0S1 ⇒ 10S10S1 ⇒ ...
  S ⇒ T ⇒ S
When do we stop?
Problems
How do we know when to stop?
  S → 0S1 | 1S0S1 | T
  T → S | ε
x = 00111
  S ⇒ 0S1 ⇒ 00S11
          ⇒ 01S0S11
          ⇒ 0T1
  S ⇒ 1S0S1 ⇒ 10S10S1 ⇒ ...
Problems
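Idea: Stop the derivation when its length exceeds |x|
Not right because of ε-productions
  S → 0S1 | 1S0S1 | T
  T → S | ε
x = 01011
  S ⇒ 0S1 ⇒ 01S0S11 ⇒ 01S011 ⇒ 01011
  lengths: 1, 3, 7, 6, 5
The sentential form grows to length 7 > |x| = 5 before shrinking back to x (each erased S stands for S ⇒ T ⇒ ε)
We might want to eliminate ε-productions too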
Unit productions
A unit production is a production of the form
  A1 → A2
where A1 and A2 are both variables
Example
grammar:
  S → 0S1 | 1S0S1 | T
  T → S | R | ε
  R → 0SR
unit productions:
  S → T, T → S, T → R
Removal of unit productions
If there is a cycle of unit productions
  A1 → A2 → ... → Ak → A1
delete it and replace everything with A1
Example
before:
  S → 0S1 | 1S0S1 | T
  T → S | R | ε
  R → 0SR
after:
  S → 0S1 | 1S0S1
  S → R | ε
  R → 0SR
T is replaced by S in the {S, T} cycle
Removal of unit productions
For other unit productions, replace every chain
  A1 → A2 → ... → Ak → α
by productions A1 → α, ..., Ak → α
Example
S → R → 0SR is replaced by S → 0SR, R → 0SR
before:
  S → 0S1 | 1S0S1 | R | ε
  R → 0SR
after:
  S → 0S1 | 1S0S1 | 0SR | ε
  R → 0SR
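The removal of unit productions can be sketched in Python. This is a minimal sketch under stated assumptions, not the procedure from the slides: instead of first merging cycles and then replacing chains, it uses the standard "unit closure" formulation, in which every variable inherits the non-unit productions of all variables it can reach through unit productions. As a result T is kept as a (now unreachable) variable rather than being merged into S. The dictionary representation (one character per symbol, `""` for ε) and the function names are assumptions made for illustration.

```python
def is_unit(rhs, variables):
    """A right-hand side is a unit production body if it is a single variable."""
    return len(rhs) == 1 and rhs in variables


def remove_unit_productions(grammar):
    """Return a grammar with the same language and no unit productions.

    grammar: dict mapping each variable to a set of right-hand-side strings,
    one character per symbol; "" stands for the empty string.
    """
    variables = set(grammar)
    result = {}
    for a in grammar:
        # Collect every variable reachable from `a` through unit productions.
        reachable, stack = {a}, [a]
        while stack:
            b = stack.pop()
            for rhs in grammar[b]:
                if is_unit(rhs, variables) and rhs not in reachable:
                    reachable.add(rhs)
                    stack.append(rhs)
        # `a` inherits all non-unit right-hand sides of the reachable variables.
        result[a] = {
            rhs
            for b in reachable
            for rhs in grammar[b]
            if not is_unit(rhs, variables)
        }
    return result


# The example grammar with the unit productions S -> T, T -> S, T -> R.
grammar = {
    "S": {"0S1", "1S0S1", "T"},
    "T": {"S", "R", ""},
    "R": {"0SR"},
}
cleaned = remove_unit_productions(grammar)
```

For S and R the sketch yields the same productions as the two-step example: S → 0S1 | 1S0S1 | 0SR | ε and R → 0SR.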
Removal of ε-productions
A variable N is nullable if there is a derivation
  N ⇒* ε
How to remove ε-productions (except from S)
Find all nullable variables N1, ..., Nk
For i = 1 to k
  For every production of the form A → αNiβ, add another production A → αβ
  If Ni → ε is a production, remove it
If S is nullable, add the special production S → ε
Example
Find all nullable variables N1, ..., Nk
grammar:
  S → ACD
  A → a
  B → ε
  C → ED | ε
  D → BC | b
  E → b
nullable variables: B, C, D
Finding nullable variables
To find nullable variables, we work backwards
First, mark all variables A s.t. A → ε as nullable
Then, as long as there are productions of the form
  A → A1 ... Ak
where all of A1, ..., Ak are marked as nullable, mark A as nullable
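The marking procedure is a fixed-point computation, which the following Python sketch illustrates. The grammar representation (a dictionary from variables to sets of right-hand-side strings, one character per symbol, with `""` for ε) and the function name are assumptions made for this example.

```python
def nullable_variables(grammar):
    """Return the set of variables that can derive the empty string."""
    # First, mark all variables A that have a production A -> epsilon.
    nullable = {a for a, rhss in grammar.items() if "" in rhss}
    # Then keep marking A whenever some production A -> A1...Ak
    # has every Ai already marked, until nothing changes.
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            if a in nullable:
                continue
            if any(all(sym in nullable for sym in rhs) for rhs in rhss):
                nullable.add(a)
                changed = True
    return nullable


# The example grammar: S -> ACD, A -> a, B -> eps, C -> ED | eps, D -> BC | b, E -> b
grammar = {
    "S": {"ACD"},
    "A": {"a"},
    "B": {""},
    "C": {"ED", ""},
    "D": {"BC", "b"},
    "E": {"b"},
}
found = nullable_variables(grammar)
```

On this grammar B and C are marked in the first step, D is marked through D → BC, and S stays unmarked because A is not nullable.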
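Eliminating ε-productions
grammar:
  S → ACD
  A → a
  B → ε
  C → ED | ε
  D → BC | b
  E → b
nullable variables: B, C, D
For i = 1 to k
  For every production of the form A → αNiβ, add another production A → αβ
  If Ni → ε is a production, remove it
added productions:
  D → C
  S → AD
  D → B
  D → ε
  S → AC
  S → A
  C → E
removed productions: B → ε, C → ε, and the newly added D → ε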
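The elimination step can be sketched in Python as follows. This is a sketch under assumptions rather than a transcription of the loop above: instead of processing N1, ..., Nk one at a time, it uses the common equivalent formulation in which each production spawns every variant obtained by dropping some subset of its nullable symbols, and empty right-hand sides are discarded. The grammar representation (one character per symbol, `""` for ε) and the function names are illustrative.

```python
from itertools import product


def nullable_variables(grammar):
    """Fixed-point computation of the variables that derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            if a not in nullable and any(
                all(sym in nullable for sym in rhs) for rhs in rhss
            ):
                nullable.add(a)
                changed = True
    return nullable


def eliminate_epsilon(grammar, start="S"):
    """Return a grammar without epsilon-productions, except possibly start -> ''."""
    nullable = nullable_variables(grammar)
    result = {}
    for a, rhss in grammar.items():
        new_rhss = set()
        for rhs in rhss:
            # Each nullable symbol may be kept or dropped; other symbols are kept.
            choices = [(sym, "") if sym in nullable else (sym,) for sym in rhs]
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate:  # never emit an epsilon-production here
                    new_rhss.add(candidate)
        result[a] = new_rhss
    if start in nullable:
        result[start].add("")  # the special production S -> epsilon
    return result


grammar = {
    "S": {"ACD"},
    "A": {"a"},
    "B": {""},
    "C": {"ED", ""},
    "D": {"BC", "b"},
    "E": {"b"},
}
cleaned = eliminate_epsilon(grammar)
```

On the example grammar this produces S → ACD | AD | AC | A, C → ED | E and D → BC | B | C | b, and leaves B with no productions at all.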
Recap
After eliminating ε-productions and unit productions, we know that every derivation
  S ⇒* a1...ak   where a1, ..., ak are terminals
doesn't shrink in length and doesn't go into cycles
Exception: S → ε
We will not use this rule at all, except to check if ε ∈ L
Note
ε-productions must be eliminated before unit productions
Example: testing membership
grammar:
  S → 0S1 | 1S0S1 | T
  T → S | ε
eliminate unit, ε-productions:
  S → ε | 01 | 101 | 0S1 | 10S1 | 1S01 | 1S0S1
x = 00111
  S ⇒ 01, 101
  S ⇒ 10S1 ⇒ 10011, strings of length ≥ 6
  S ⇒ 1S01 ⇒ 10101, strings of length ≥ 6
  S ⇒ 1S0S1 ⇒ only strings of length ≥ 6
  S ⇒ 0S1 ⇒ 0011, 01011, strings of length ≥ 6
  S ⇒ 0S1 ⇒ 00S11 ⇒ only strings of length ≥ 6
No branch produces x, so x is not in the language
Algorithm 1 for testing membership
We can now use the following algorithm to check if a string x is in the language of G
Eliminate all ε-productions and unit productions
If x = ε and S → ε, accept; else delete S → ε
Let X := S
While some new production P can be applied to X
  Apply P to X
  If X = x, accept
  If |X| > |x|, backtrack
If no more productions can be applied to X, reject
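A minimal Python sketch of this search is given below. It assumes the grammar has already been cleaned (no unit productions, no ε-productions other than possibly S → ε) and is stored as a dictionary from variables to sets of right-hand-side strings, one character per symbol. Two choices are assumptions of the sketch rather than part of the algorithm as stated: it always expands the leftmost variable, which is safe because every derivable string has a leftmost derivation, and it uses an explicit stack plus a `seen` set in place of recursive backtracking. The function name `derives` is illustrative.

```python
def derives(grammar, x, start="S"):
    """Decide whether the cleaned grammar derives the string x."""
    if x == "":
        return "" in grammar[start]
    # Delete S -> epsilon: it is only used for the empty-string check above.
    rules = {a: [r for r in rhss if r != ""] for a, rhss in grammar.items()}
    seen = set()
    stack = [start]
    while stack:
        form = stack.pop()
        if form == x:
            return True
        # Position of the leftmost variable, or None if form is all terminals.
        pos = next((i for i, sym in enumerate(form) if sym in rules), None)
        if pos is None:
            continue  # a terminal string different from x: dead end
        for rhs in rules[form[pos]]:
            new_form = form[:pos] + rhs + form[pos + 1:]
            # Forms never shrink, so anything longer than x can be abandoned.
            if len(new_form) <= len(x) and new_form not in seen:
                seen.add(new_form)
                stack.append(new_form)
    return False


# S -> eps | 01 | 101 | 0S1 | 10S1 | 1S01 | 1S0S1
grammar = {"S": {"", "01", "101", "0S1", "10S1", "1S01", "1S0S1"}}
in_language = derives(grammar, "01011")
not_in_language = derives(grammar, "00111")
```

The search terminates because only finitely many sentential forms have length at most |x| and each one is explored at most once.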
Practical limitations of Algorithm I
The previous algorithm can be very slow if x is long
  G = CFG of the Java programming language
  x = code for a 200-line Java program
  algorithm might take about 10^200 steps!
There is a faster algorithm, but it requires that we do some more transformations on the grammar
Chomsky Normal Form
A grammar is in Chomsky Normal Form if every production (except possibly S → ε) is of the type
  A → BC   or   A → a
Conversion to Chomsky Normal Form is easy:
  A → BcDE
replace terminals with new variables:
  A → BCDE
  C → c
break up sequences with new variables:
  A → BX1
  X1 → CX2
  X2 → DE
  C → c
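The two conversion steps can be sketched in Python. The sketch assumes that ε-productions and unit productions have already been eliminated, that right-hand sides are tuples of symbols, and that a symbol counts as a variable when it starts with an uppercase letter. The fresh names `T_c` (for the terminal c) and `X1`, `X2`, ... are hypothetical and are assumed not to clash with variables already in the grammar.

```python
def is_variable(sym):
    """Convention assumed here: variables start with an uppercase letter."""
    return sym[:1].isupper()


def to_cnf(grammar):
    """Convert a grammar (dict: variable -> list of symbol tuples) to CNF."""
    cnf = {}
    counter = 0

    def add(head, rhs):
        # Record head -> rhs once, keeping insertion order.
        cnf.setdefault(head, [])
        if rhs not in cnf[head]:
            cnf[head].append(rhs)

    for a, rhss in grammar.items():
        for rhs in rhss:
            if len(rhs) <= 1:
                add(a, tuple(rhs))  # already of the form A -> a
                continue
            # Step 1: replace terminals with new variables.
            symbols = []
            for sym in rhs:
                if is_variable(sym):
                    symbols.append(sym)
                else:
                    name = "T_" + sym
                    add(name, (sym,))
                    symbols.append(name)
            # Step 2: break up long sequences with new variables.
            head = a
            while len(symbols) > 2:
                counter += 1
                fresh = f"X{counter}"
                add(head, (symbols[0], fresh))
                head = fresh
                symbols = symbols[1:]
            add(head, tuple(symbols))
    return cnf


# The slide's example production A -> BcDE.
converted = to_cnf({"A": [("B", "c", "D", "E")]})
```

Apart from the name chosen for the terminal's variable (`T_c` instead of C), the result has the same shape as the example: A → B X1, X1 → T_c X2, X2 → D E, T_c → c.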
Exercise
Convert this CFG into Chomsky Normal Form:
  S → ε | ADDA
  A → a
  C → c
  D → bCb
Algorithm 2 for testing membership
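  S → AB | BC
  A → BA | a
  B → CC | b
  C → AB | a
x = baaba
Idea: We generate each substring of x bottom up
  length 5:  baaba: S, A, C
  length 4:  baab: ∅      aaba: S, A, C
  length 3:  baa: ∅       aab: B       aba: B
  length 2:  ba: S, A     aa: B        ab: S, C     ba: S, A
  length 1:  b: B         a: A, C      a: A, C      b: B         a: A, C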
Parse tree reconstruction
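  S → AB | BC
  A → BA | a
  B → CC | b
  C → AB | a
x = baaba
  length 5:  baaba: S, A, C
  length 4:  baab: ∅      aaba: S, A, C
  length 3:  baa: ∅       aab: B       aba: B
  length 2:  ba: S, A     aa: B        ab: S, C     ba: S, A
  length 1:  b: B         a: A, C      a: A, C      b: B         a: A, C
Tracing back the derivations, we obtain the parse tree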
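One way to make the tracing-back step concrete is to store, next to each variable in a cell, the production and split point that first put it there. The Python sketch below does this; the rule format (a list of `(variable, right-hand side)` pairs with one character per symbol), the 0-based inclusive cell indices, and the names `cyk_parse` and `leaves` are assumptions made for illustration. When several parse trees exist, it returns only the first one it finds.

```python
def cyk_parse(rules, x, start="S"):
    """Return one parse tree for x as nested tuples, or None if x is not derivable.

    rules: list of (A, rhs) pairs for a grammar in CNF, where rhs is either
    a single terminal or two variables.
    """
    k = len(x)
    if k == 0:
        return None  # the empty string needs the separate S -> epsilon check
    # back[(i, j)][A] records how A derives x[i..j] (0-based, inclusive).
    back = {}
    for i in range(k):
        back[(i, i)] = {a: x[i] for a, rhs in rules if rhs == x[i]}
    for length in range(2, k + 1):
        for s in range(k - length + 1):
            t = s + length - 1
            cell = {}
            for j in range(s, t):  # split into x[s..j] and x[j+1..t]
                for a, rhs in rules:
                    if (len(rhs) == 2 and rhs[0] in back[(s, j)]
                            and rhs[1] in back[(j + 1, t)]):
                        # Keep the first way found to derive a in this cell.
                        cell.setdefault(a, (rhs[0], rhs[1], j))
            back[(s, t)] = cell

    def build(a, i, j):
        entry = back[(i, j)][a]
        if i == j:
            return (a, entry)  # leaf: variable over a terminal
        b, c, split = entry
        return (a, build(b, i, split), build(c, split + 1, j))

    if start not in back[(0, k - 1)]:
        return None
    return build(start, 0, k - 1)


def leaves(tree):
    """Concatenate the terminals at the leaves of a parse tree."""
    if len(tree) == 2:
        return tree[1]
    return leaves(tree[1]) + leaves(tree[2])


rules = [("S", "AB"), ("S", "BC"), ("A", "BA"), ("A", "a"),
         ("B", "CC"), ("B", "b"), ("C", "AB"), ("C", "a")]
tree = cyk_parse(rules, "baaba")
```

Reading the leaves of the returned tree from left to right gives back the input string, which is a quick sanity check on the reconstruction.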
Cocke-Younger-Kasami algorithm
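Input: Grammar G in CNF, string x = x1...xk
Cell ij remembers all variables that derive the substring xi...xj
For i = 1 to k
  If there is a production A → xi
    Put A in table cell ii
For b = 2 to k
  For s = 1 to k - b + 1
    Set t = s + b - 1
    For j = s to t - 1
      If there is a production A → BC
      where B is in cell sj and C is in cell (j+1)t
        Put A in cell st
Table layout: cells 11, 22, ..., kk sit under x1, x2, ..., xk; the next row holds 12, 23, ...; the top cell is 1k. Here b is the length of the substring, s and t are its endpoints, and j is the split point.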
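The table-filling procedure can be sketched in Python as follows. The sketch uses 1-based inclusive cells (s, t), so a substring of length b that starts at s ends at t = s + b - 1 and is split into cells (s, j) and (j + 1, t). The rule format (a list of `(variable, right-hand side)` pairs, one character per symbol) and the function names are assumptions for illustration.

```python
def cyk_table(rules, x):
    """Fill the CYK table for x; table[(s, t)] is the set of variables
    deriving the substring x_s...x_t (1-based, inclusive)."""
    k = len(x)
    table = {}
    for i in range(1, k + 1):
        # Variables A with a production A -> x_i.
        table[(i, i)] = {a for a, rhs in rules if rhs == x[i - 1]}
    for b in range(2, k + 1):            # substring length
        for s in range(1, k - b + 2):    # start position
            t = s + b - 1                # end position
            cell = set()
            for j in range(s, t):        # split point
                for a, rhs in rules:
                    if (len(rhs) == 2 and rhs[0] in table[(s, j)]
                            and rhs[1] in table[(j + 1, t)]):
                        cell.add(a)
            table[(s, t)] = cell
    return table


def cyk_accepts(rules, x, start="S"):
    """x is in the language iff the start variable is in the top cell."""
    return len(x) > 0 and start in cyk_table(rules, x)[(1, len(x))]


rules = [("S", "AB"), ("S", "BC"), ("A", "BA"), ("A", "a"),
         ("B", "CC"), ("B", "b"), ("C", "AB"), ("C", "a")]
table = cyk_table(rules, "baaba")
```

The three nested loops over b, s and j give a running time proportional to k^3 times the number of productions, in contrast to the exhaustive search over derivations.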