This research was partially supported by the U.S. National
Science Foundation via STIMULATE grant No. 9618874
Introduction
The Structured Language Model (SLM)
- An attempt to exploit the syntactic structure of natural language
- Consists of a predictor, a tagger and a parser
- Jointly assigns a probability to a word sequence and its parse structure
- Still suffers from the data-sparseness problem; Deleted Interpolation (DI) has been used to address it
We use Kneser-Ney smoothing to improve performance
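The deleted-interpolation baseline mentioned above mixes trigram, bigram and unigram relative frequencies. A minimal sketch with fixed weights and a toy corpus (not the paper's data; in practice the weights are estimated by EM on held-out data):

```python
from collections import Counter

# Toy corpus; in the SLM these counts come from the training treebank.
corpus = "the contract ended with a loss of seven cents after the close".split()

uni = Counter(corpus)
bi = Counter(zip(corpus, corpus[1:]))
tri = Counter(zip(corpus, corpus[1:], corpus[2:]))
N = len(corpus)

def p_interp(w, u, v, lams=(0.6, 0.3, 0.1)):
    """Deleted interpolation: mix trigram, bigram and unigram
    relative frequencies. The weights `lams` are fixed here for
    illustration; they are normally tuned by EM on held-out data."""
    l3, l2, l1 = lams
    p3 = tri[(u, v, w)] / bi[(u, v)] if bi[(u, v)] else 0.0
    p2 = bi[(v, w)] / uni[v] if uni[v] else 0.0
    p1 = uni[w] / N
    return l3 * p3 + l2 * p2 + l1 * p1
```

Because the weights sum to one and each component is a proper conditional distribution, the mixture remains a proper distribution over the vocabulary.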
The Structured Language Model (SLM)
Example of a Partial Parse
Probability estimation in the SLM
Kneser-Ney Smoothing
Experiment Result
N-Best Rescoring
Test Set PPL as a Function of Interpolation Weight (λ)
ASR WER for SWB
Smoothing Issues in the Structured Language Model
Woosung Kim, Sanjeev Khudanpur, and Jun Wu
The Center for Language and Speech Processing, The Johns Hopkins University
3400 N. Charles Street, Barton Hall, Baltimore, MD 21218
{woosung, sanjeev, junwu}@clsp.jhu.edu
Concluding Remarks
• KN smoothing of the SLM shows modest but consistent improvements
– in both PPL and WER
• Future Work
– SLM with Maximum Entropy Models
– However, Maximum Entropy Model training requires heavy computation
– Selection of features for the Maximum Entropy Models has yielded fruitful results
Smoothing             3gram   SLM     Intpl
Deleted Intpl         39.1%   38.6%   38.2%
KN-BO (Predictor)     38.3%   37.7%   37.5%
KN-BO (All Modules)   38.3%   37.8%   37.7%
Nonlinear Intpl       38.1%   37.6%   37.5%
NI w/ Deleted Est.    38.3%   37.7%   37.5%
[Figure: Test set PPL as a function of the interpolation weight λ (x-axis 0–1, y-axis 50–170); curves: KN BO for WSJ, KN NI for WSJ, KN BO for SWB, KN NI for SWB.]
Language Model Perplexity

                                 WSJ Corpus            SWB
Smoothing             EM Iter.   3gram  SLM   Intpl    3gram  SLM   Intpl
Deleted Intpl         EM0        162    166   149      70     73    67
                      EM3        162    154   146      70     72    66
KN-BO (Predictor)     EM0        152    166   139      64     64    60
                      EM3        152    149   137      64     63    60
KN-BO (All Modules)   EM0        152    170   141      64     64    60
                      EM3        152    153   140      64     63    60
Nonlinear Intpl       EM0        146    152   132      65     65    61
                      EM3        146    141   131      65     65    61
NI w/ Deleted Est.    EM0        145    150   131      64     63    60
                      EM3        145    141   130      64     64    60
Database Size Specifications (in Words)

Item                  WSJ          SWB
Word Voc.             10K (open)   21K (closed)
Part-Of-Speech Tags   40           49
Non-Terminal Tags     54           64
Parser Operations     136          112
LM Dev. Set           885K         2.07M
LM Check Set          117K         216K
LM Test Set           82K          20K
ASR Test Set          -            20K
• Two corpora
– Wall Street Journal (WSJ) UPenn Treebank: for LM PPL tests
– Switchboard (SWB): for ASR WER tests as well as LM PPL
• Tokenization
– Original SWB tokenization (examples: They're, It's, etc.) is not suitable for syntactic analysis
– Treebank tokenization (examples: They 're, It 's, etc.)
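The conversion from original SWB tokenization to Treebank-style tokenization can be sketched with a pair of regular expressions; a minimal sketch covering only the common contraction cases (the real Treebank tokenizer handles many more punctuation patterns):

```python
import re

def treebank_contractions(text):
    """Split contractions the way the Penn Treebank does, since
    fused tokens like "They're" are unsuitable for syntactic
    analysis. Only common cases are covered here."""
    # "don't" -> "do n't"
    text = re.sub(r"(?i)\b(\w+)(n't)\b", r"\1 \2", text)
    # "They're" -> "They 're", "It's" -> "It 's", etc.
    text = re.sub(r"(?i)(\w+)('re|'s|'ll|'ve|'d|'m)\b", r"\1 \2", text)
    return text
```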
Speech → Speech Recognizer (Baseline LM) → 100-Best Hypotheses → Rescoring (New LM) → 1 hypothesis
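The rescoring stage of this pipeline can be sketched as follows; a minimal sketch with illustrative names (`nbest`, `new_lm_logprob`, `lm_weight` are not from the poster):

```python
def rescore_nbest(nbest, new_lm_logprob, lm_weight=1.0):
    """Pick the best of the recognizer's N-best hypotheses after
    replacing the baseline LM score with a new LM score (e.g. the
    SLM). `nbest` is a list of (words, acoustic_logprob) pairs;
    `new_lm_logprob` scores a word sequence in log-probability."""
    def total(hyp):
        words, acoustic_logprob = hyp
        return acoustic_logprob + lm_weight * new_lm_logprob(words)
    return max(nbest, key=total)[0]
```

Usage: with two hypotheses, the one preferred by the combined acoustic and new-LM score wins, even if the recognizer ranked it lower.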
Parse tree probability:

$$P(W,T) = \prod_{i=1}^{n+1} \underbrace{P(w_i \mid h_{-2}, h_{-1})}_{\text{predictor}} \cdot \underbrace{P(tag_i \mid w_i, h_{-1}.tag, h_{-2}.tag)}_{\text{tagger}} \cdot \underbrace{P(T_i \mid w_i, tag_i, T_{i-1})}_{\text{parser}}$$

where $h_{-1}, h_{-2}$ are the two most recent exposed heads of the partial parse.

LM PPL (word-level probability):

$$P(w_{i+1} \mid W_i) = \sum_{T_i^j \in S_i} \rho(W_i, T_i^j)\, P(w_{i+1} \mid W_i, T_i^j), \qquad \rho(W_i, T_i^j) = \frac{P(W_i, T_i^j)}{\sum_{T_i^k \in S_i} P(W_i, T_i^k)}$$

where $S_i$ is the set of partial parses surviving the search at position $i$.
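The word-level probability weights each surviving partial parse by its normalized joint probability and mixes the per-parse predictor distributions. A minimal sketch with illustrative names (`parses`, `predict` are not from the poster):

```python
def next_word_prob(parses, predict):
    """SLM word prediction: weight each partial parse T by
    rho(W, T) = P(W, T) / sum_T' P(W, T') and mix the per-parse
    predictor probabilities. `parses` maps a parse id to its joint
    probability P(W, T); `predict(T)` gives P(w | W, T)."""
    z = sum(parses.values())
    return sum((p / z) * predict(t) for t, p in parses.items())
```

Usage: with two parses of joint probability 0.3 and 0.1, the normalized weights are 0.75 and 0.25, so the mixture is a convex combination of the two predictor values.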
[Figure: Example of a partial parse of "The contract ended with a loss of 7 cents after"; POS tags: DT NN VBD IN DT NN IN CD NNS; headword-annotated constituents include contract_NP, ended_VP, with_PP, loss_NP, of_PP, cents_NP.]
Backoff:

$$P(w \mid u, v) = \begin{cases} \dfrac{N(u,v,w) - d}{N(u,v)} & \text{if } N(u,v,w) > 0 \\[6pt] \beta(u,v)\, \dfrac{n(v,w)}{\sum_{\tilde w} n(v,\tilde w)} & \text{otherwise} \end{cases}$$

Nonlinear Interpolation:

$$P(w \mid u, v) = \frac{\max\{N(u,v,w) - d,\, 0\}}{N(u,v)} + \beta(u,v)\, \frac{n(v,w)}{\sum_{\tilde w} n(v,\tilde w)}$$

where $n(v,w) = |\{\tilde u : N(\tilde u, v, w) > 0\}|$ is the Kneser-Ney continuation count, $d$ is the discount, and $\beta(u,v)$ is chosen so that each distribution normalizes.
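The back-off variant can be sketched for bigrams; a minimal sketch, not the paper's trigram-context implementation, but it illustrates the key Kneser-Ney idea that the lower-order distribution uses continuation counts rather than raw frequencies:

```python
from collections import Counter

def kn_bigram(corpus, d=0.75):
    """Back-off Kneser-Ney for bigrams. The lower-order
    distribution uses continuation counts n(w) = |{v : N(v,w) > 0}|,
    not raw unigram counts. `d` is the absolute discount."""
    bi = Counter(zip(corpus, corpus[1:]))
    ctx = Counter(corpus[:-1])                 # N(v)
    cont = Counter(w for (_, w) in bi)         # n(w): distinct left contexts
    vocab = set(corpus)

    def p(w, v):
        if bi[(v, w)] > 0:
            # discounted relative frequency for seen bigrams
            return (bi[(v, w)] - d) / ctx[v]
        # beta(v): probability mass freed up by discounting
        seen = sum(1 for u in vocab if bi[(v, u)] > 0)
        beta = d * seen / ctx[v]
        # back off to continuation counts, normalized over unseen words
        unseen_cont = sum(cont[u] for u in vocab if bi[(v, u)] == 0)
        return beta * cont[w] / unseen_cont if unseen_cont else 0.0
    return p
```

Because the discounted mass is redistributed exactly over the unseen words, the conditional distribution for each observed context sums to one.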