Mohammad Ashiqur Rahman
Department of Computer Science, College of Engineering
Tennessee Tech University
Normal Forms for Context-Free Grammars
Foundations of Computer Science, Spring 2017
CYK Algorithm
The CYK Algorithm
Let G = (V, Σ, P, S) be a CFG.
Question: given a string w, is w ∈ L(G)?
The CYK algorithm answers this question.
It is named after J. Cocke, D. Younger, and T. Kasami, who independently developed an algorithm to answer this question.
The CYK Algorithm (2)
The algorithm requires the rules of the grammar to be in Chomsky normal form.
It uses dynamic programming, a bottom-up approach.
Revisiting Chomsky normal form
A CFG G = (V, Σ, P, S) is in Chomsky normal form if each rule in G has one of the following forms:
i. A → BC
ii. A → a
iii. S → λ
where A, B, C, S ∈ V, B, C ∈ V − {S}, and a ∈ Σ.
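The three rule forms can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding (not from the slides) in which a grammar is a dictionary from each variable to a list of right-hand sides, each right-hand side is a tuple of symbols, and the empty tuple stands for λ. The function name `is_cnf` is likewise an illustrative choice.

```python
from typing import Dict, List, Set, Tuple

# Assumed encoding: variable -> list of right-hand sides (tuples of symbols).
# The empty tuple () stands for the null string.
Grammar = Dict[str, List[Tuple[str, ...]]]


def is_cnf(rules: Grammar, start: str, terminals: Set[str]) -> bool:
    """Return True if every rule has one of the three Chomsky normal forms."""
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 2:
                # Form i: A -> BC, with B and C variables other than S.
                if not all(s in variables and s != start for s in body):
                    return False
            elif len(body) == 1:
                # Form ii: A -> a, with a a terminal.
                if body[0] not in terminals:
                    return False
            elif len(body) == 0:
                # Form iii: only the start symbol may derive the null string.
                if head != start:
                    return False
            else:
                return False
    return True


# The grammar used in the aaabbb example later in these slides.
example = {
    "S": [("A", "T"), ("A", "B")],
    "T": [("X", "B")],
    "X": [("A", "T"), ("A", "B")],
    "A": [("a",)],
    "B": [("b",)],
}
print(is_cnf(example, "S", {"a", "b"}))  # True
```

A rule such as S → SS or A → BCD makes the check fail, because the start symbol appears on a right-hand side or the right-hand side is too long.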
Idea of the CYK Algorithm
Let u = x_1 x_2 … x_n be a string to be tested for u ∈ L(G).
Let x_{i,j} denote the substring x_i x_{i+1} … x_j of u; x_{i,i} is simply x_i.
The strategy of the CYK algorithm for the grammar G:
Step 1: For each substring x_{i,i} of u with length one, find the set X_{i,i} of all variables A with a rule A → x_{i,i}.
Step 2: For each substring x_{i,i+1} of u with length two, find the set X_{i,i+1} of all variables A that initiate a derivation A ⇒* x_{i,i+1}.
Step 3: For each substring x_{i,i+2} of u with length three, find the set X_{i,i+2} of all variables A that initiate a derivation A ⇒* x_{i,i+2}.
…
Step n − 1: For the substrings x_{1,n−1} and x_{2,n} of u with length n − 1, find the sets X_{1,n−1} and X_{2,n} of all variables A that initiate derivations A ⇒* x_{1,n−1} and A ⇒* x_{2,n}, respectively.
Step n: For the string x_{1,n} = u with length n, find the set X_{1,n} of all variables A that initiate a derivation A ⇒* x_{1,n}.
If S ∈ X_{1,n}, then u ∈ L(G); otherwise u ∉ L(G).
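The steps above can be sketched as a short recognizer. This is an illustrative sketch rather than a reference implementation: the dictionary encoding of the grammar and the names `cyk_table` and `accepts` are assumptions made for the example, and the sets X_{i,j} are stored in a dictionary keyed by the 1-based pair (i, j).

```python
from typing import Dict, List, Set, Tuple

# Assumed encoding: variable -> list of right-hand sides (tuples of symbols).
Grammar = Dict[str, List[Tuple[str, ...]]]


def cyk_table(rules: Grammar, u: str) -> Dict[Tuple[int, int], Set[str]]:
    """Compute X[i, j] (1-based, i <= j) for a CNF grammar and a non-empty u."""
    n = len(u)
    X: Dict[Tuple[int, int], Set[str]] = {}
    # Step 1: variables with a rule A -> x_i.
    for i in range(1, n + 1):
        X[i, i] = {A for A, bodies in rules.items() if (u[i - 1],) in bodies}
    # Steps 2..n: substrings of increasing length.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell: Set[str] = set()
            for k in range(i, j):  # split into x_{i,k} and x_{k+1,j}
                for A, bodies in rules.items():
                    for body in bodies:
                        if (len(body) == 2 and body[0] in X[i, k]
                                and body[1] in X[k + 1, j]):
                            cell.add(A)
            X[i, j] = cell
    return X


def accepts(rules: Grammar, start: str, u: str) -> bool:
    """Return True if u is in L(G), i.e. the start symbol is in X[1, n]."""
    if u == "":
        # The null string is derivable only through a rule S -> λ.
        return () in rules.get(start, [])
    return start in cyk_table(rules, u)[1, len(u)]


# The grammar used in the aaabbb example later in these slides.
grammar = {
    "S": [("A", "T"), ("A", "B")],
    "T": [("X", "B")],
    "X": [("A", "T"), ("A", "B")],
    "A": [("a",)],
    "B": [("b",)],
}
print(accepts(grammar, "S", "aaabbb"))  # True
```

The three nested loops over length, start position, and split point give a running time cubic in the length of u, multiplied by the number of rules.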
CYK: Matrix Representation
The sets X_{i,j} can be represented as the upper triangular portion of an n × n matrix:

      1        2        3        …   n−1          n
1     X_{1,1}  X_{1,2}  X_{1,3}  …   X_{1,n−1}    X_{1,n}
2              X_{2,2}  X_{2,3}  …   X_{2,n−1}    X_{2,n}
3                       X_{3,3}  …   X_{3,n−1}    X_{3,n}
⋮                                ⋱   ⋮            ⋮
n−1                                  X_{n−1,n−1}  X_{n−1,n}
n                                                 X_{n,n}
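The order in which the matrix is filled follows the steps of the algorithm: first the main diagonal (length one), then each diagonal above it. A small sketch of that traversal is given below. The function name `cells_by_diagonal` is a hypothetical choice for illustration.

```python
from typing import Iterator, Tuple


def cells_by_diagonal(n: int) -> Iterator[Tuple[int, int]]:
    """Yield the indices (i, j), i <= j, shortest substrings first."""
    for length in range(1, n + 1):          # step number = substring length
        for i in range(1, n - length + 2):  # 1-based start position
            yield i, i + length - 1


order = list(cells_by_diagonal(3))
print(order)  # [(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3)]
```

An n × n matrix has n(n + 1)/2 cells on or above the main diagonal, so a string of length 6 requires 21 sets.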
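How Does the CYK Algorithm Work?
Why does it work?
Step 1 is straightforward.
For Step 2, the derivation of x_{i,i+1} has the form:
    A ⇒ BC ⇒ x_i C ⇒ x_i x_{i+1}
In general, for a substring x_{i,i+t} of length t + 1 (Step t + 1), the derivation has the form:
    A ⇒ BC ⇒* x_{i,k} C ⇒* x_{i,k} x_{k+1,i+t}
For t ≥ 1, A derives x_{i,i+t} if and only if there is a rule A → BC and an index k with i ≤ k < i + t such that B ∈ X_{i,k} and C ∈ X_{k+1,i+t}.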
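Written as a recurrence, the observation above gives each set in terms of sets for shorter substrings. The formulation below restates the slide's condition in set notation, using the same symbols (V, P, x_i, X_{i,j}):

```latex
\begin{align*}
X_{i,i}   &= \{\, A \in V \mid A \to x_i \in P \,\}, \\
X_{i,i+t} &= \bigcup_{k=i}^{i+t-1}
             \{\, A \in V \mid A \to BC \in P,\ B \in X_{i,k},\ C \in X_{k+1,i+t} \,\},
             \qquad 1 \le t \le n - i .
\end{align*}
```

Every set on the right-hand side belongs to a substring shorter than x_{i,i+t}, which is why filling the matrix in order of increasing length works.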
CYK: Example
Is aaabbb a string in L(G)?
S → AT | AB
T → XB
X → AT | AB
A → a
B → b

     1        2        3        4        5        6
1    X_{1,1}  X_{1,2}  X_{1,3}  X_{1,4}  X_{1,5}  X_{1,6}
2             X_{2,2}  X_{2,3}  X_{2,4}  X_{2,5}  X_{2,6}
3                      X_{3,3}  X_{3,4}  X_{3,5}  X_{3,6}
4                               X_{4,4}  X_{4,5}  X_{4,6}
5                                        X_{5,5}  X_{5,6}
6                                                 X_{6,6}
CYK: Example
Filled-in table for aaabbb (∅ marks an empty set):

     1       2       3       4       5       6
1    {A}     ∅       ∅       ∅       ∅       {S, X}
2            {A}     ∅       ∅       {S, X}  {T}
3                    {A}     {S, X}  {T}     ∅
4                            {B}     ∅       ∅
5                                    {B}     ∅
6                                            {B}

Since S ∈ X_{1,6}, aaabbb ∈ L(G).
The CYK Algorithm for Parsing
The CYK algorithm can be used to produce derivations of strings in L(G).
Derivation       Sets
S ⇒ AT           A ∈ X_{1,1}, T ∈ X_{2,6}
  ⇒ aT           T ∈ X_{2,6}
  ⇒ aXB          X ∈ X_{2,5}, B ∈ X_{6,6}
  ⇒ aATB         A ∈ X_{2,2}, T ∈ X_{3,5}, B ∈ X_{6,6}
  ⇒ aaTB         T ∈ X_{3,5}, B ∈ X_{6,6}
  ⇒ aaXBB        X ∈ X_{3,4}, B ∈ X_{5,5}, B ∈ X_{6,6}
  ⇒ aaABBB       A ∈ X_{3,3}, B ∈ X_{4,4}, B ∈ X_{5,5}, B ∈ X_{6,6}
  ⇒* aaabbb
The sets are read from the table for aaabbb in the example above.
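One way to recover such a derivation is to record, for each variable placed in a set, one rule and split point that put it there, and then expand the leftmost variable repeatedly. The sketch below assumes the same hypothetical dictionary encoding of the grammar. The names `cyk_backpointers` and `leftmost_derivation` are illustrative, and only one derivation is returned even when several exist.

```python
from typing import Dict, List, Tuple

# Assumed encoding: variable -> list of right-hand sides (tuples of symbols).
Grammar = Dict[str, List[Tuple[str, ...]]]


def cyk_backpointers(rules: Grammar, u: str) -> dict:
    """back[i, j][A] is a terminal (i == j) or one triple (B, C, k) with
    A -> BC, B deriving x_{i,k} and C deriving x_{k+1,j}."""
    n = len(u)
    back: dict = {}
    for i in range(1, n + 1):
        back[i, i] = {A: u[i - 1] for A, bodies in rules.items()
                      if (u[i - 1],) in bodies}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell: dict = {}
            for k in range(i, j):
                for A, bodies in rules.items():
                    for body in bodies:
                        if (len(body) == 2 and body[0] in back[i, k]
                                and body[1] in back[k + 1, j]):
                            # Keep the first rule and split point found.
                            cell.setdefault(A, (body[0], body[1], k))
            back[i, j] = cell
    return back


def leftmost_derivation(rules: Grammar, start: str, u: str) -> List[str]:
    """Return the sentential forms of one leftmost derivation of u,
    or an empty list if u is empty or not derivable."""
    n = len(u)
    if n == 0:
        return []
    back = cyk_backpointers(rules, u)
    if start not in back[1, n]:
        return []

    def render(form: list) -> str:
        return "".join(s if isinstance(s, str) else s[0] for s in form)

    # Items are terminals (str) or pending variables (A, i, j).
    form: list = [(start, 1, n)]
    steps = [render(form)]
    while any(isinstance(s, tuple) for s in form):
        pos = next(p for p, s in enumerate(form) if isinstance(s, tuple))
        A, i, j = form[pos]
        entry = back[i, j][A]
        if i == j:
            form[pos] = entry  # apply A -> a
        else:
            B, C, k = entry    # apply A -> BC
            form[pos:pos + 1] = [(B, i, k), (C, k + 1, j)]
        steps.append(render(form))
    return steps


grammar = {
    "S": [("A", "T"), ("A", "B")],
    "T": [("X", "B")],
    "X": [("A", "T"), ("A", "B")],
    "A": [("a",)],
    "B": [("b",)],
}
print(" => ".join(leftmost_derivation(grammar, "S", "aaabbb")))
```

For aaabbb the first eight sentential forms match the derivation shown on the slide, from S through aaABBB. The remaining steps rewrite A and each B to terminals one at a time.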
CYK: Another Example
Is abba a string in L(G)?
S → AX | AY | a
X → AX | a
Y → BY | a
A → a
B → b

     1        2        3        4
1    X_{1,1}  X_{1,2}  X_{1,3}  X_{1,4}
2             X_{2,2}  X_{2,3}  X_{2,4}
3                      X_{3,3}  X_{3,4}
4                               X_{4,4}
CYK: Another Example (2)
Filled-in table for abba (∅ marks an empty set):

     1             2             3             4
1    {S, X, Y, A}  ∅             ∅             {S}
2                  {B}           ∅             {Y}
3                                {B}           {Y}
4                                              {S, X, Y, A}

Since S ∈ X_{1,4}, abba ∈ L(G).
CYK: Do It Yourself
Is baaba a string in L(G)?
S → AB | BC
A → BA | a
B → CC | b
C → AB | a

     1        2        3        4        5
1    X_{1,1}  X_{1,2}  X_{1,3}  X_{1,4}  X_{1,5}
2             X_{2,2}  X_{2,3}  X_{2,4}  X_{2,5}
3                      X_{3,3}  X_{3,4}  X_{3,5}
4                               X_{4,4}  X_{4,5}
5                                        X_{5,5}

Each correct slot is worth 0.5 marks, for a total of 7.5 marks.
Were You Able to Do It Correctly?
Is baaba ∈ L(G)? Filled-in table (∅ marks an empty set):

     1          2          3          4          5
1    {B}        {S, A}     ∅          ∅          {S, A, C}
2               {A, C}     {B}        {B}        {S, A, C}
3                          {A, C}     {S, C}     {B}
4                                     {B}        {S, A}
5                                                {A, C}

Since S ∈ X_{1,5}, baaba ∈ L(G).
THANKS
Source: Chapter 4, Languages and Machines, Thomas Sudkamp