Practical problems with Chomsky-Schützenberger parsing for ......Practicalproblems...

Preview:

Citation preview

Practical problemswith Chomsky-Schützenberger parsing

for weighted multiple context-free grammars1

Tobias Denkingertobias.denkinger@tu-dresden.de

Institute of Theoretical Computer ScienceFaculty of Computer Science

Technische Universität Dresden

WATA, Leipzig, 2018-05-23

1based on T. Denkinger (2017). “Chomsky-Schützenberger parsing for weightedmultiple context-free languages”.

The problem: 𝑘-best parsing

𝑘-best

parsing problem

[Huang and Chiang 2005]

Input:a

(𝒜, ⊙, 𝟙, 𝟘)-weighted

grammar

(

𝐺

,wt)a suitable partial order ⊴ on (𝒜, ⊙, 𝟙, 𝟘)a number 𝑘 ∈ ℕ

a word 𝑤

Output:a

sequence of 𝑘 best

derivation

s2

of 𝑤 in 𝐺(not unique)

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 2 / 13

The problem: 𝑘-best parsing

𝑘-best parsing problem [Huang and Chiang 2005]Input:

a (𝒜, ⊙, 𝟙, 𝟘)-weighted grammar (𝐺,wt)a suitable partial order ⊴ on (𝒜, ⊙, 𝟙, 𝟘)a number 𝑘 ∈ ℕa word 𝑤

Output:a sequence of 𝑘 best derivations2 of 𝑤 in 𝐺

(not unique)

2w.r.t. wt and ⊴ (greater is better)T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 2 / 13

Multiple context-free grammars

context-free grammars

𝐴 → a𝐴b𝐵 composes strings

𝐴 → [(𝑥, 𝑦) ↦ a𝑥b𝑦⏟⏟⏟⏟⏟⏟⏟𝛴∗×𝛴∗→𝛴∗

](𝐴, 𝐵)

multiple context-free grammars[Seki, Matsumura, Fujii, and Kasami 1991]

𝐴 → [((𝑥1, 𝑥2), (𝑦1, 𝑦2)) ↦ (a𝑥1𝑦2b, 𝑦1c𝑥2)⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟(𝛴∗×𝛴∗)×(𝛴∗×𝛴∗)→(𝛴∗×𝛴∗)

](𝐴, 𝐵)

composes tuples of strings

⟹ extra expressive power useful for natural language processing

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 3 / 13

Multiple context-free grammars

context-free grammars

𝐴 → a𝐴b𝐵 composes strings

𝐴 → [(𝑥, 𝑦) ↦ a𝑥b𝑦⏟⏟⏟⏟⏟⏟⏟𝛴∗×𝛴∗→𝛴∗

](𝐴, 𝐵)

multiple context-free grammars[Seki, Matsumura, Fujii, and Kasami 1991]

𝐴 → [((𝑥1, 𝑥2), (𝑦1, 𝑦2)) ↦ (a𝑥1𝑦2b, 𝑦1c𝑥2)⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟(𝛴∗×𝛴∗)×(𝛴∗×𝛴∗)→(𝛴∗×𝛴∗)

](𝐴, 𝐵)

composes tuples of strings

⟹ extra expressive power useful for natural language processing

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 3 / 13

Multiple context-free grammars

context-free grammars

𝐴 → a𝐴b𝐵 composes strings

𝐴 → [ a𝑥 b 𝑦 ](𝐴, 𝐵)

multiple context-free grammars[Seki, Matsumura, Fujii, and Kasami 1991]

𝐴 → [((𝑥1, 𝑥2), (𝑦1, 𝑦2)) ↦ (a𝑥1𝑦2b, 𝑦1c𝑥2)⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟(𝛴∗×𝛴∗)×(𝛴∗×𝛴∗)→(𝛴∗×𝛴∗)

](𝐴, 𝐵)

composes tuples of strings

⟹ extra expressive power useful for natural language processing

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 3 / 13

Multiple context-free grammars

context-free grammars

𝐴 → a𝐴b𝐵 composes strings

𝐴 → [ a𝑥 b 𝑦 ](𝐴, 𝐵)

multiple context-free grammars[Seki, Matsumura, Fujii, and Kasami 1991]

𝐴 → [((𝑥1, 𝑥2), (𝑦1, 𝑦2)) ↦ (a𝑥1𝑦2b, 𝑦1c𝑥2)⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟(𝛴∗×𝛴∗)×(𝛴∗×𝛴∗)→(𝛴∗×𝛴∗)

](𝐴, 𝐵)

composes tuples of strings

⟹ extra expressive power useful for natural language processing

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 3 / 13

Multiple context-free grammars

context-free grammars

𝐴 → a𝐴b𝐵 composes strings

𝐴 → [ a𝑥 b 𝑦 ](𝐴, 𝐵)

multiple context-free grammars[Seki, Matsumura, Fujii, and Kasami 1991]

𝐴 → [ a𝑥1𝑦2b, 𝑦1c𝑥2 ](𝐴, 𝐵)

composes tuples of strings

⟹ extra expressive power useful for natural language processing

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 3 / 13

Multiple context-free grammars

context-free grammars

𝐴 → a𝐴b𝐵 composes strings

𝐴 → [ a𝑥 b 𝑦 ](𝐴, 𝐵)

multiple context-free grammars[Seki, Matsumura, Fujii, and Kasami 1991]

𝐴 → [ a𝑥1𝑦2b, 𝑦1c𝑥2 ](𝐴, 𝐵)

composes tuples of strings

⟹ extra expressive power useful for natural language processing

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 3 / 13

The Chomsky-Schützenberger theorem

CS-theorems [Chomsky and Schützenberger 1963]

[Yoshinaka, Kaji, and Seki 2010]

Let 𝐿 be a language. T.f.a.e.

1. ∃

M

CFG 𝐺 s.t. 𝐿 = L(𝐺)2. ∃ regular language 𝑅,

multiple

Dyck language 𝐷,∃ homomorphism ℎs.t. 𝐿 = ℎ(𝑅 ∩ 𝐷)

Idea [Hulden 2011, for CFGs]Use the decomposition provided by (1. → 2.) for parsing.

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 4 / 13

The Chomsky-Schützenberger theorem

CS-theorems [Chomsky and Schützenberger 1963][Yoshinaka, Kaji, and Seki 2010]

Let 𝐿 be a language. T.f.a.e.

1. ∃ MCFG 𝐺 s.t. 𝐿 = L(𝐺)2. ∃ regular language 𝑅,

∃ multiple Dyck language 𝐷,∃ homomorphism ℎs.t. 𝐿 = ℎ(𝑅 ∩ 𝐷)

Idea [Hulden 2011, for CFGs]Use the decomposition provided by (1. → 2.) for parsing.

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 4 / 13

The Chomsky-Schützenberger theorem

CS-theorems [Chomsky and Schützenberger 1963][Yoshinaka, Kaji, and Seki 2010]

Let 𝐿 be a language. T.f.a.e.

1. ∃ MCFG 𝐺 s.t. 𝐿 = L(𝐺)2. ∃ regular language 𝑅,

∃ multiple Dyck language 𝐷,∃ homomorphism ℎs.t. 𝐿 = ℎ(𝑅 ∩ 𝐷)

Idea [Hulden 2011, for CFGs]Use the decomposition provided by (1. → 2.) for parsing.

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 4 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺)

⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)

= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)

= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤

⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)

= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)

= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)

= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)

= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))

= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))

= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))

= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

enumerate from a weighted finite-state automaton

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 5 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

an MCFG 𝐺:

𝛼: 𝑆 → [𝑥1𝑥2](𝐴)𝛽: 𝐴 → [a𝑥1b, c𝑥2](𝐴)𝛾: 𝐴 → [𝜀, 𝜀]()

L(𝐺) = {a𝑛b𝑛c𝑛 ∣ 𝑛 ∈ ℕ}

word 𝑤 =

aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

an MCFG 𝐺:

𝛼: 𝑆 → [𝑥1𝑥2](𝐴)𝛽: 𝐴 → [a𝑥1b, c𝑥2](𝐴)𝛾: 𝐴 → [𝜀, 𝜀]()

L(𝐺) = {a𝑛b𝑛c𝑛 ∣ 𝑛 ∈ ℕ}

word 𝑤 =

aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

an MCFG 𝐺:

𝛼: 𝑆 → [𝑥1𝑥2](𝐴)𝛽: 𝐴 → [a𝑥1b, c𝑥2](𝐴)𝛾: 𝐴 → [𝜀, 𝜀]()

L(𝐺) = {a𝑛b𝑛c𝑛 ∣ 𝑛 ∈ ℕ}

word 𝑤 =

aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b 𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 =

aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b 𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 =

aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b 𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 =

aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b 𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 =

aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b 𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 =

aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b 𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 = a

abbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b 𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 = aa

bbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b

𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 = aa

bbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b

𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 = aa

bbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b

𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 = aa

bbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b

𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 = aab

bcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b 𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 = aabbcc

𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

𝜀

a

𝐴1

𝜀

b 𝐴2

𝜀

c

𝐴2

𝜀

𝜀

𝑆1

𝜀

word 𝑤 = aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

Derivations and bracket words

𝛼: 𝑆 → [ 𝑥1 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛽: 𝐴 → [ a 𝑥1 𝑏 , 𝑐 𝑥2 ](𝐴)

𝛾: 𝐴 → [ 𝜀 , 𝜀 ]()

𝑆1 𝑆1

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝐴1 𝐴1 𝐴2 𝐴2

𝑆1

start

𝐴1

[1𝛼[1𝛼,1

[1𝛽a[1𝛽,1

𝐴1

[1𝛾]1𝛾]1𝛽,1b]1𝛽 𝐴2

]1𝛼,1[1𝛼,2

[2𝛽c[2𝛽,1

𝐴2

[2𝛾]2𝛾

]2𝛽,1]2𝛽

𝑆1

]1𝛼,2]1𝛼

word 𝑤 = aabbcc 𝑤′ = aaabbcc ∉ L(𝐺)

word 𝑢 = [1𝛼[1𝛼,1 [1𝛽a[1𝛽,1 [1𝛽a[1𝛽,1 [1𝛾]1𝛾 ]1𝛽,1b]1𝛽 ]1𝛽,1b]1𝛽 ]1𝛼,1[1𝛼,2 [2𝛽c[2𝛽,1 [2𝛽c[2𝛽,1 [2𝛾]2𝛾 ]2𝛽,1]2𝛽 ]2𝛽,1]2𝛽 ]1𝛼,2]1𝛼

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 6 / 13

From the CS-theorem to CS-parsing

𝑤 ∈ L(𝐺) ⟺ 𝑤 ∈ ℎ(𝑅 ∩ 𝐷) (CS-theorem)

⟺ ∃𝑢 ∈ 𝑅 ∩ 𝐷: ℎ(𝑢) = 𝑤⟺ ∃𝑢 ∈ 𝑅 ∩ ℎ−1(𝑤): 𝑢 ∈ 𝐷

ObservationEach 𝑢 ∈ 𝑅 ∩ 𝐷 encodes a derivation of 𝐺.

𝑘-best CS-parsing

parse𝐺,wt,𝑘(𝑤)= (take𝑘 ∘ sortwt⊴ ∘ toDeriv ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ sortwt⊴ ∘ filter∩𝐷)(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sortwt⊴ )(𝑅 ∩ ℎ−1(𝑤))= (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

enumerate from a weighted finite-state automaton

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 7 / 13

Practical problems… with the weighted finite state automaton 𝑅wt

𝑆1

start

𝐴1

[1𝛼[1𝛼,1

[1𝛽a[1𝛽,1

𝐴1

[1𝛾]1𝛾

]1𝛽,1b]1𝛽 𝐴2

]1𝛼,1[1𝛼,2

[2𝛽c[2𝛽,1

𝐴2

[2𝛾]2𝛾

]2𝛽,1]2𝛽

𝑆1

]1𝛼,2]1𝛼

enumerate 𝑅wt

by ascending weight

Dijkstra-like algorithm

initial idea:

attach weights to[1𝜎-brackets

problem:

loops with weight 𝟙

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 8 / 13

Practical problems… with the weighted finite state automaton 𝑅wt

𝑆1

start

𝐴1

[1𝛼[1𝛼,1

[1𝛽a[1𝛽,1

𝐴1

[1𝛾]1𝛾

]1𝛽,1b]1𝛽 𝐴2

]1𝛼,1[1𝛼,2

[2𝛽c[2𝛽,1

𝐴2

[2𝛾]2𝛾

]2𝛽,1]2𝛽

𝑆1

]1𝛼,2]1𝛼

enumerate 𝑅wt

by ascending weight

Dijkstra-like algorithm

initial idea:

attach weights to[1𝜎-brackets

problem:

loops with weight 𝟙

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 8 / 13

Practical problems… with the weighted finite state automaton 𝑅wt

𝑆1

start

𝐴1

[1𝛼[1𝛼,1

[1𝛽a[1𝛽,1

𝐴1

[1𝛾]1𝛾

]1𝛽,1b]1𝛽 𝐴2

]1𝛼,1[1𝛼,2

[2𝛽c[2𝛽,1

𝐴2

[2𝛾]2𝛾

]2𝛽,1]2𝛽

𝑆1

]1𝛼,2]1𝛼

enumerate 𝑅wt

by ascending weight

Dijkstra-like algorithm

initial idea:

attach weights to[1𝜎-brackets

problem:

loops with weight 𝟙

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 8 / 13

Practical problems… with the weighted finite state automaton 𝑅wt

𝑆1

start

𝐴1

[1𝛼[1𝛼,1/wt𝛼

[1𝛽a[1𝛽,1/wt𝛽

𝐴1

[1𝛾]1𝛾/wt𝛾

]1𝛽,1b]1𝛽/1 𝐴2

]1𝛼,1[1𝛼,2/1

[2𝛽c[2𝛽,1/1

𝐴2

[2𝛾]2𝛾/1

]2𝛽,1]2𝛽/1

𝑆1

]1𝛼,2]1𝛼/1

enumerate 𝑅wt

by ascending weight

Dijkstra-like algorithm

initial idea:

attach weights to[1𝜎-brackets

problem:

loops with weight 𝟙

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 8 / 13

Practical problems… with the weighted finite state automaton 𝑅wt

𝑆1

start

𝐴1

[1𝛼[1𝛼,1/wt𝛼

[1𝛽a[1𝛽,1/wt𝛽

𝐴1

[1𝛾]1𝛾/wt𝛾

]1𝛽,1b]1𝛽/1 𝐴2

]1𝛼,1[1𝛼,2/1

[2𝛽c[2𝛽,1/1

𝐴2

[2𝛾]2𝛾/1

]2𝛽,1]2𝛽/1

𝑆1

]1𝛼,2]1𝛼/1

enumerate 𝑅wt

by ascending weight

Dijkstra-like algorithm

initial idea:

attach weights to[1𝜎-brackets

problem:

loops with weight 𝟙

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 8 / 13

Solutions and workarounds I

… with the weighted finite state automaton 𝑅wt

𝑆1

start

𝐴1

[1𝛼[1𝛼,1/wt𝛼

[1𝛽a[1𝛽,1/wt𝛽

𝐴1

[1𝛾]1𝛾/wt𝛾

]1𝛽,1b]1𝛽/1 𝐴2

]1𝛼,1[1𝛼,2/1

[2𝛽c[2𝛽,1/1

𝐴2

[2𝛾]2𝛾/1

]2𝛽,1]2𝛽/1

𝑆1

]1𝛼,2]1𝛼/1

assume thatwt𝜎 ≠ 𝟙 in loops

assume thatweights can befactorised

distribute factorsof wt𝜎 amongtransitions with[1𝜎, ]1𝜎, [2𝜎, ]2𝜎, …

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 9 / 13

Solutions and workarounds I

… with the weighted finite state automaton 𝑅wt

𝑆1

start

𝐴1

[1𝛼[1𝛼,1/wt𝛼

[1𝛽a[1𝛽,1/wt𝛽

𝐴1

[1𝛾]1𝛾/wt𝛾

]1𝛽,1b]1𝛽/1 𝐴2

]1𝛼,1[1𝛼,2/1

[2𝛽c[2𝛽,1/1

𝐴2

[2𝛾]2𝛾/1

]2𝛽,1]2𝛽/1

𝑆1

]1𝛼,2]1𝛼/1

assume thatwt𝜎 ≠ 𝟙 in loops

assume thatweights can befactorised

distribute factorsof wt𝜎 amongtransitions with[1𝜎, ]1𝜎, [2𝜎, ]2𝜎, …

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 9 / 13

Solutions and workarounds I

… with the weighted finite state automaton 𝑅wt

𝑆1

start

𝐴1

[1𝛼[1𝛼,1/wt𝛼

[1𝛽a[1𝛽,1/wt𝛽

𝐴1

[1𝛾]1𝛾/wt𝛾

]1𝛽,1b]1𝛽/1 𝐴2

]1𝛼,1[1𝛼,2/1

[2𝛽c[2𝛽,1/1

𝐴2

[2𝛾]2𝛾/1

]2𝛽,1]2𝛽/1

𝑆1

]1𝛼,2]1𝛼/1

assume thatwt𝜎 ≠ 𝟙 in loops

assume thatweights can befactorised

distribute factorsof wt𝜎 amongtransitions with[1𝜎, ]1𝜎, [2𝜎, ]2𝜎, …

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 9 / 13

Solutions and workarounds I

… with the weighted finite state automaton 𝑅wt

𝑆1

start

𝐴1

[1𝛼[1𝛼,1/ 2√wt𝛼

[1𝛽a[1𝛽,1/ 4√wt𝛽

𝐴1

[1𝛾]1𝛾/ 2√wt𝛾

]1𝛽,1b]1𝛽/ 4√wt𝛽 𝐴2

]1𝛼,1[1𝛼,2/1

[2𝛽c[2𝛽,1/ 4√wt𝛽

𝐴2

[2𝛾]2𝛾/ 2√wt𝛾

]2𝛽,1]2𝛽/ 4√wt𝛽

𝑆1

]1𝛼,2]1𝛼/ 2√wt𝛼

assume thatwt𝜎 ≠ 𝟙 in loops

assume thatweights can befactorised

distribute factorsof wt𝜎 amongtransitions with[1𝜎, ]1𝜎, [2𝜎, ]2𝜎, …

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 9 / 13

Solutions and workarounds II

Assumption 1: weights wt𝜎 that occur in loops are ≠ 𝟙

⟹ restrict the grammar

restricted weighted MCFGs:may not have derivations of the form

𝛼1𝛼2⋮𝛼𝑘

𝛼1⋮

with wt𝛼1= … = wt𝛼𝑘

= 𝟙

useful probabilistic MCFGs are al-ways restricted

𝔹-weighted MCFGs can be trans-formed to restricted ℕ-weightedMCFGs with the same support

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 10 / 13

Solutions and workarounds II

Assumption 1: weights wt𝜎 that occur in loops are ≠ 𝟙⟹ restrict the grammar

restricted weighted MCFGs:may not have derivations of the form

𝛼1𝛼2⋮𝛼𝑘

𝛼1⋮

with wt𝛼1= … = wt𝛼𝑘

= 𝟙

useful probabilistic MCFGs are al-ways restricted

𝔹-weighted MCFGs can be trans-formed to restricted ℕ-weightedMCFGs with the same support

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 10 / 13

Solutions and workarounds II

Assumption 1: weights wt𝜎 that occur in loops are ≠ 𝟙⟹ restrict the grammar

restricted weighted MCFGs:may not have derivations of the form

𝛼1𝛼2⋮𝛼𝑘

𝛼1⋮

with wt𝛼1= … = wt𝛼𝑘

= 𝟙

useful probabilistic MCFGs are al-ways restricted

𝔹-weighted MCFGs can be trans-formed to restricted ℕ-weightedMCFGs with the same support

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 10 / 13

Solutions and workarounds II

Assumption 1: weights wt𝜎 that occur in loops are ≠ 𝟙⟹ restrict the grammar

restricted weighted MCFGs:may not have derivations of the form

𝛼1𝛼2⋮𝛼𝑘

𝛼1⋮

with wt𝛼1= … = wt𝛼𝑘

= 𝟙

useful probabilistic MCFGs are al-ways restricted

𝔹-weighted MCFGs can be trans-formed to restricted ℕ-weightedMCFGs with the same support

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 10 / 13

Solutions and workarounds II

Assumption 1: weights wt𝜎 that occur in loops are ≠ 𝟙⟹ restrict the grammar

restricted weighted MCFGs:may not have derivations of the form

𝛼1𝛼2⋮𝛼𝑘

𝛼1⋮

with wt𝛼1= … = wt𝛼𝑘

= 𝟙

useful probabilistic MCFGs are al-ways restricted

𝔹-weighted MCFGs can be trans-formed to restricted ℕ-weightedMCFGs with the same support

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 10 / 13

Solutions and workarounds III

Assumption 2: weights can be factorised

⟹ restrict weight algebra

factorisable (multiplicative) monoid with zero (𝒜, ⊙, 𝟙, 𝟘)∀𝑎 ∈ 𝒜 ∖ {𝟘, 𝟙}: ∃𝑎1, 𝑎2 ∈ 𝒜 ∖ {𝟙}: 𝑎1 ⊙ 𝑎2 = 𝑎

two examples from nlp:

(𝒜, ⊙, 𝟙, 𝟘) factorisation

([0, 1], ⋅, 1, 0) 𝑎 = 2√

𝑎 ⋅ 2√

𝑎

(ℝ−∞≥0 , +, 0, −∞) 𝑎 = 𝑎/2 + 𝑎/2

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 11 / 13

Solutions and workarounds III

Assumption 2: weights can be factorised⟹ restrict weight algebra

factorisable (multiplicative) monoid with zero (𝒜, ⊙, 𝟙, 𝟘)∀𝑎 ∈ 𝒜 ∖ {𝟘, 𝟙}: ∃𝑎1, 𝑎2 ∈ 𝒜 ∖ {𝟙}: 𝑎1 ⊙ 𝑎2 = 𝑎

two examples from nlp:

(𝒜, ⊙, 𝟙, 𝟘) factorisation

([0, 1], ⋅, 1, 0) 𝑎 = 2√

𝑎 ⋅ 2√

𝑎

(ℝ−∞≥0 , +, 0, −∞) 𝑎 = 𝑎/2 + 𝑎/2

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 11 / 13

Solutions and workarounds III

Assumption 2: weights can be factorised⟹ restrict weight algebra

factorisable (multiplicative) monoid with zero (𝒜, ⊙, 𝟙, 𝟘)∀𝑎 ∈ 𝒜 ∖ {𝟘, 𝟙}: ∃𝑎1, 𝑎2 ∈ 𝒜 ∖ {𝟙}: 𝑎1 ⊙ 𝑎2 = 𝑎

two examples from nlp:

(𝒜, ⊙, 𝟙, 𝟘) factorisation

([0, 1], ⋅, 1, 0) 𝑎 = 2√

𝑎 ⋅ 2√

𝑎

(ℝ−∞≥0 , +, 0, −∞) 𝑎 = 𝑎/2 + 𝑎/2

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 11 / 13

Solutions and workarounds III

Assumption 2: weights can be factorised⟹ restrict weight algebra

factorisable (multiplicative) monoid with zero (𝒜, ⊙, 𝟙, 𝟘)∀𝑎 ∈ 𝒜 ∖ {𝟘, 𝟙}: ∃𝑎1, 𝑎2 ∈ 𝒜 ∖ {𝟙}: 𝑎1 ⊙ 𝑎2 = 𝑎

two examples from nlp:

(𝒜, ⊙, 𝟙, 𝟘) factorisation

([0, 1], ⋅, 1, 0) 𝑎 = 2√

𝑎 ⋅ 2√

𝑎

(ℝ−∞≥0 , +, 0, −∞) 𝑎 = 𝑎/2 + 𝑎/2

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 11 / 13

Conclusion and outlook

Theorem (𝑘-best parsing)Let (𝐺,wt) be a restricted weighted MCFG over a factorisablemonoid with zero and ⊴ be a suitable partial order on the monoid.Then (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

solves the 𝑘-best parsing problem for (𝐺,wt) and a word 𝑤.

ConjectureThe restrictions are not problematic in practice.

refined and implemented by T. Ruprecht (in his master thesis)

he currently investigates practical viability

Thank you for your attention.

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 12 / 13

Conclusion and outlook

Theorem (𝑘-best parsing)Let (𝐺,wt) be a restricted weighted MCFG over a factorisablemonoid with zero and ⊴ be a suitable partial order on the monoid.Then (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

solves the 𝑘-best parsing problem for (𝐺,wt) and a word 𝑤.

ConjectureThe restrictions are not problematic in practice.

refined and implemented by T. Ruprecht (in his master thesis)

he currently investigates practical viability

Thank you for your attention.

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 12 / 13

Conclusion and outlook

Theorem (𝑘-best parsing)Let (𝐺,wt) be a restricted weighted MCFG over a factorisablemonoid with zero and ⊴ be a suitable partial order on the monoid.Then (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

solves the 𝑘-best parsing problem for (𝐺,wt) and a word 𝑤.

ConjectureThe restrictions are not problematic in practice.

refined and implemented by T. Ruprecht (in his master thesis)

he currently investigates practical viability

Thank you for your attention.

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 12 / 13

Conclusion and outlook

Theorem (𝑘-best parsing)Let (𝐺,wt) be a restricted weighted MCFG over a factorisablemonoid with zero and ⊴ be a suitable partial order on the monoid.Then (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

solves the 𝑘-best parsing problem for (𝐺,wt) and a word 𝑤.

ConjectureThe restrictions are not problematic in practice.

refined and implemented by T. Ruprecht (in his master thesis)

he currently investigates practical viability

Thank you for your attention.

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 12 / 13

Conclusion and outlook

Theorem (𝑘-best parsing)Let (𝐺,wt) be a restricted weighted MCFG over a factorisablemonoid with zero and ⊴ be a suitable partial order on the monoid.Then (toDeriv ∘ take𝑘 ∘ filter∩𝐷 ∘ sort⊴)(𝑅wt B ℎ−1(𝑤))

solves the 𝑘-best parsing problem for (𝐺,wt) and a word 𝑤.

ConjectureThe restrictions are not problematic in practice.

refined and implemented by T. Ruprecht (in his master thesis)

he currently investigates practical viability

Thank you for your attention.

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 12 / 13

References

Chomsky, N. and M. P. Schützenberger (1963). “The algebraic theory of context-freelanguages”. doi: 10.1016/S0049-237X(09)70104-1.

Denkinger, T. (2017). “Chomsky-Schützenberger parsing for weighted multiple context-freelanguages”. doi: 10.15398/jlm.v5i1.159.

Huang, L. and D. Chiang (2005). “Better k-best Parsing”.Hulden, M. (2011). “Parsing CFGs and PCFGs with a Chomsky-Schützenberger

Representation”. doi: 10.1007/978-3-642-20095-3_14.Seki, H., T. Matsumura, M. Fujii, and T. Kasami (1991). “On multiple context-free grammars”.

doi: 10.1016/0304-3975(91)90374-B.Yoshinaka, R., Y. Kaji, and H. Seki (2010). “Chomsky-Schützenberger-type characterization of

multiple context-free languages”. doi: 10.1007/978-3-642-13089-2_50.

T. Denkinger: Practical problems with CS-parsing for wMCFGs WATA, Leipzig, 2018-05-23 13 / 13