View
24
Download
1
Embed Size (px)
Citation preview
Graph Structure
・ Use search graph in phrase-based model ・ At weighted acyclic directed graph G < Ф,V,E,s,g,> Ф : phrase pair sets =feature vector h( ・ ) ・ weight V: vertex partial hypotheses E:edges weight of route E ⊆ V×V× Ф×A A: weight sets
Graph Structure
• out()= edge sets which go out from vertex • in() = : edge sets which head to vertex ->Phrase pairs are linked by <out(), in()>At figure 5.8, phrase pair <へ行った , I went to> is linked by out() = <-----,0,<s>> and in()=<-- ・・・ ,9,went to>
𝑣
𝑣
Graph Structure
• If Ѱ=(, ,…, ): rout from start to any vertexs, head()=tail(), then
Source language phrase sets: Target language phrase sets: Route weight: =
Graph Structure
• In Fig.5.8, for the route
-> the parallel of word sets of source language 「行った」「へ」「領事館」 is “He went to the consulate”
Start
<行った ,He went>
<へ ,to><領事館 ,
the consulate>
Semiring
• set R equipped with two binary operations addition“ + ” and multiplication “ × ”
• Associative: a+(b+c)=(a+b)+c, a×(b×c)=(a×b)×c• Commutative: a+b=b+a• Distributional: a×(b+c)=(a×b)+(a×c)• Additive inverse, multiplicative inverse 0+a=a+0=a; 1×a=a×1=a; 0×a=a×0=0 are not defined
Semiring
• In Table 5.1, tropical semiring is used to solve maximization problem for route weight in decoder
A ⊕ ⊗
Tropical max + ー 0
Semiring
• In weight directed graph G, for a rout from starting point to ending point of source language input f is Ѱ=
• Score of Ѱ = product of partial route = -> Problem which maximize this score is max⊗()= ⊕⊗()
A ⊕ ⊗Tropical max + ー 0
Semiring
• In Fig.5.7,line 11 Q(+1,)max additive operation ⊕ is implemented for each vertex tail(e)=s of G• As semiring sastifies distributional feature-> weight of any vertexs V is ⊕⊗()=⊗
Semiring
• Forward-backward algorithm for finding maximum of route weight in graph structure
• topological order(G): list of vertexs of graph G which arranged in topological order
• external variable
Semiring
FORWARD(G)• topological order(G), ein()⊗
⊕ Start
tail(e)(e)
(e) ⊗
Semiring
BACKWARD(G)• inversetopological order(G), e()⊗
⊕ Goal
(e)
(e) ⊗
head(e)
Semiring
In problem which choose the optimum translation from search space expressed by weighted directed graph G Tropical semiring + Forward algorithm->Viterbi semiring
k-best
• Besides forward-backward algorithm, k-best algorithm is used to optimize route weight
• Dijkstra’s algorithm: for single source shortest path problem
• Eppstein’s algorithm: for heaping multiple paths efficiently
k-best
• Assume problem satisfies Tropical semiring and backward algorithm• Calculate and choose max (weight )• Fig.5.10 algorithm ・ cand: priority queue ・ < , s>: partial route ・ < ,>: partial route whose vertex and edgeout() ・ D: set of < ,>
k-best
• k=1: Initialized cand
• Optimize weight of partial route and whole route
Whole route
D
cand
optimal
get out < , s>,register D Choose and out() insert to cand
heap ( ・ ) to get optimal
k time
Limitation of Search Space
• If search space is big->any sort can be forgiven->calculation amount of decode algorithm become massive->limitation is necessary: ・ Distortion limit, constraint ・ Reordering limit, constraint
Distortion Constraint
• Upper limit setting d for distance between phrase pair d The purpose is making model score small if model distorted lead to penalty become bigFor language pair which do not have big sort, distortion constraint reach good efficiencyIf d=0: no skip, translate from left to right smoothly->monotone translation
Distortion Constraint• Constraint for case when have partial phrases do not reach the ending point : position of the first phrase of source language : the first position of translated phraseIf (), add d・ IBM Constraint
�̈� 𝑠𝑡𝑎𝑟𝑡𝑘 𝑒𝑛𝑑𝑘・・・
phrase
No need to exam
Beam Search
・ Prune disused partial hypothesis and pay attention only partial hypothesis with high score for computational reduction・ Group of vertexs of search graph and prune partial hypothesis which has low score
Beam Search・ Group of vertexs of search graph and prune partial hypothesis which has low score
Partial hypothesis pruned Partial hypothesis chose
Beam Search
Some kinds of grouping: - Cover vector grouping - Radix grouping - Beam width pruning - Histogram pruning
Heuristic Function
• Prevent partial hypothesis which has not been translated yet from pruning• Give predicted score for the rout and learn by A* search so that rout score get the maximum• ->can reduce search error
Pre-reordering Method
Translation between languages which has significantly different grammatical structure• Pre-reordering rule• Pre-reordering model• Pre-reordering learning
Pre-reordering Rule
• Based on tree from syntactic analysis, reorder to target language word order• Head-driven phrase structure grammar(HPSG)’s rule: - Syntactic anlysis - Move the subjects back
Pre-reordering Model
• Source languages must have syntactic analysis tool and morphological analysis tool• Bilingual data are necessary• Probability value of pre-reordering patterns obtained will be estimated by maximum-likelihood estimation(MLE)• Choose the suitable pre-reordering patterns based on reordering part of speech from morphological analysis, or clustering word class
Pre-reordering Learning
• For language pairs without any syntactic analysis tools and morphological analysis tools• Provisional tree structure automatically generated from syntactic analysis result• Divide tree factors to 2 labels: reordering label [X],and no-reordering label <X>• Use linear ordering problem(LOP) to formulate reordering model to find the approximate solution and build the parse tree