The NNI QbE-STD System for MediaEval 2014
Peng Yang1, Haihua Xu2, Xiong Xiao2, Lei Xie1, Cheung-Chi Leung3,
Hongjie Chen1, Jia Yu1, Hang Lv1, Lei Wang3, Su Jun Leow2,
Bin Ma3, Eng Siong Chng1, Haizhou Li2,3
1 Northwestern Polytechnical University, Xi’an, China
2 Nanyang Technological University, Singapore
3 Institute for Infocomm Research, A*STAR, Singapore
Presented by Haihua Xu, Temasek Laboratories@NTU, Singapore
NNI QbE-STD system, MediaEval 2014 Workshop, Barcelona
System Diagram
Two groups of subsystems are used:
• Subsequence DTW-based template matching on Gaussian/phone posteriorgram and bottleneck features.
• Symbolic search (SS) using a phone tokenizer and a weighted finite state transducer (WFST).
Tokenizers
Tokenizers are used to convert the audio signal into
• posteriorgram or bottleneck features for DTW-based systems
• phone sequences/lattices for SS systems
DTW-based Systems
• Full sequence matching1: conventional subsequence DTW. Good for type 1 queries.
• Partial matching, used for type 2 and 3 queries:
  • Uses partial feature segments of the query for matching.
  • Segments are 600 ms long and shifted by 50 ms.
  • Improves performance for type 3 queries.
• 9 DTW systems:
  • 5 using full matching
  • 4 using partial matching
1 Yang P. et al., “Intrinsic spectral analysis based on temporal context features for query-by-example spoken term detection”, in Proc. INTERSPEECH, 2014.
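The two matching modes above can be sketched as follows. This is a minimal illustration, not the NNI implementation: it assumes features arrive as NumPy arrays at a 10 ms frame rate (so 600 ms is 60 frames and 50 ms is 5 frames), uses cosine distance as the local cost, and takes the best-scoring segment as the partial-match score. All three choices, and the function names, are assumptions made for the example.

```python
import numpy as np

def subsequence_dtw(query, doc):
    """Best match of `query` (n x d) anywhere inside `doc` (m x d).

    Returns (length-normalized cost, end frame in doc)."""
    q = query / np.maximum(np.linalg.norm(query, axis=1, keepdims=True), 1e-12)
    d = doc / np.maximum(np.linalg.norm(doc, axis=1, keepdims=True), 1e-12)
    cost = np.maximum(1.0 - q @ d.T, 0.0)  # cosine distance (assumed local cost)
    n, m = cost.shape
    acc = np.full((n, m), np.inf)
    acc[0, :] = cost[0, :]  # the match may start at any doc frame
    for i in range(1, n):
        acc[i, 0] = acc[i - 1, 0] + cost[i, 0]
        for j in range(1, m):
            acc[i, j] = cost[i, j] + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    end = int(np.argmin(acc[-1]))  # the match may end at any doc frame
    return acc[-1, end] / n, end

def partial_segments(query, win=60, shift=5):
    """Cut the query into windows of `win` frames, shifted by `shift` frames.

    60 and 5 frames correspond to 600 ms and 50 ms at an assumed 10 ms frame rate."""
    if len(query) <= win:
        return [query]
    return [query[s:s + win] for s in range(0, len(query) - win + 1, shift)]

def partial_dtw(query, doc, win=60, shift=5):
    """Score each query segment separately and keep the best one (assumed pooling rule)."""
    return min(subsequence_dtw(seg, doc) for seg in partial_segments(query, win, shift))
```

Full matching calls `subsequence_dtw` on the whole query, while partial matching only requires one 600 ms segment of the query to align well, which is why it can tolerate the extra or reordered material in type 2 and 3 queries.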
Why Symbolic Search (SS)
• DTW is effective1, but it is
  • computationally expensive and difficult to index,
  • not easy to adapt to inexact matches.
• Symbolic search allows indexing and fast search, e.g. using a weighted finite state transducer (WFST).
1 Anguera X., Rodriguez-Fuentes L.J., Szoke I., Buzo A., and Metze F., “Query by example search on speech at MediaEval 2014”, in Working Notes Proceedings of the MediaEval 2014 Workshop, Barcelona, Spain, Oct. 16-17, 2014.
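To illustrate why a symbolic representation is easy to index, here is a small sketch that uses a phone n-gram inverted index. This is a deliberately simplified stand-in for the WFST index named above, not the system's actual index; the function names, the trigram order, and the toy phone strings are assumptions for the example.

```python
from collections import defaultdict

def build_index(utterances, n=3):
    """Map every phone n-gram to the (utterance id, start position) pairs where it occurs."""
    index = defaultdict(list)
    for utt_id, phones in utterances.items():
        for i in range(len(phones) - n + 1):
            index[tuple(phones[i:i + n])].append((utt_id, i))
    return index

def search(index, utterances, query, n=3):
    """Exact search for a phone sequence of at least n phones.

    The first n-gram of the query selects candidate positions from the index;
    only those candidates are verified against the full query."""
    hits = []
    for utt_id, start in index.get(tuple(query[:n]), []):
        if list(utterances[utt_id][start:start + len(query)]) == list(query):
            hits.append((utt_id, start))
    return hits

# Toy search collection of phone sequences (hypothetical data).
utterances = {
    "u1": ["k", "ae", "t", "s", "ih", "t"],
    "u2": ["s", "ih", "t", "k", "ae", "t"],
}
index = build_index(utterances)
hits = search(index, utterances, ["k", "ae", "t"])
```

The index is built once over the search audio, so each query costs a lookup plus a short verification, whereas DTW has to align the query against every utterance.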
Symbolic Search System
• Limitations of symbolic search for QbE-STD:
  • Phone recognizers of other languages must be used for tokenization, which gives a poor symbolic representation.
  • The phone representation is inconsistent between query and search audio.
Limitation of Conventional Symbolic Search
• Full – Full symbolic search method
• pMiss – Miss rate
• pFA – False alarm rate
• ATWV – Actual Term Weighted Value
As query length increases,
• Miss rate approaches 100%
• False alarm rate approaches 0
• ATWV approaches 0
Partial Phone Sequence Matching
Partial Matching Steps
• If a query phone hypothesis is longer than 6 phones, get all partial sequences of the hypothesis.
• Use all the unique partial sequences to search.
• Search results are pooled, and all are treated as matches of the query.
• Score normalization is applied, and the decision is made.
• The high miss rate of long queries can be reduced by simply shortening the query representation.
• Rationale: let the system return something first, and then decide which results are true matches.
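The first three steps can be sketched as below. The slide does not say how long the partial sequences are, so this example assumes contiguous windows of exactly 6 phones; the exact-match search, the function names, and the toy data are also assumptions, and score normalization and the final decision are left out.

```python
def partial_sequences(hyp, max_len=6):
    """Unique contiguous sub-sequences of `max_len` phones, in order of first appearance.

    Hypotheses of `max_len` phones or fewer are returned whole."""
    if len(hyp) <= max_len:
        return [tuple(hyp)]
    seen, parts = set(), []
    for i in range(len(hyp) - max_len + 1):
        part = tuple(hyp[i:i + max_len])
        if part not in seen:
            seen.add(part)
            parts.append(part)
    return parts

def find_all(phones, pattern):
    """Start positions of exact occurrences of `pattern` in `phones`."""
    n = len(pattern)
    return [i for i in range(len(phones) - n + 1) if tuple(phones[i:i + n]) == pattern]

def partial_search(hyp, utterances, max_len=6):
    """Search with every partial sequence and pool the hits as hits of the whole query."""
    pooled = set()
    for part in partial_sequences(hyp, max_len):
        for utt_id, phones in utterances.items():
            for start in find_all(phones, part):
                pooled.add((utt_id, start))
    return sorted(pooled)

# Toy example: single letters stand in for phone labels (hypothetical data).
query_hyp = list("abcdefgh")
utterances = {"u1": list("xxbcdefgxx"), "u2": list("abcdefgh"), "u3": list("zzzz")}
pooled_hits = partial_search(query_hyp, utterances)
```

In this toy example the full 8-phone query occurs only in `u2`, while the pooled partial search also returns `u1`, which contains just one of the 6-phone windows. That is the intended effect (fewer misses) and also the source of the extra false alarms.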
Effectiveness of Partial Phone Sequence Matching
Full – Full symbolic search method
Partial – Partial symbolic search method
pMiss – Miss rate
pFA – False alarm rate
ATWV – Actual Term Weighted Value
For queries longer than 6 phones:
• Miss rate is reduced.
• False alarm rate is increased.
• ATWV is increased.
If beta is not 66.7, the best trade-off point between pMiss and pFA will change.
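The role of beta can be seen from the standard term-weighted value, TWV = 1 - (pMiss + beta * pFA). The sketch below applies that formula to a single pair of aggregate rates, which is a simplification: the official metric averages over queries. The two operating points are made-up numbers chosen only to show the effect, not results from the system.

```python
def twv(p_miss, p_fa, beta=66.7):
    """Term-weighted value for one (pMiss, pFA) operating point."""
    return 1.0 - (p_miss + beta * p_fa)

# Two hypothetical operating points: A misses more, B raises more false alarms.
point_a = (0.6, 0.001)
point_b = (0.4, 0.005)

best_at_66_7 = max([point_a, point_b], key=lambda p: twv(*p, beta=66.7))
best_at_10 = max([point_a, point_b], key=lambda p: twv(*p, beta=10.0))
```

With beta = 66.7 the false alarms of point B cost more than its lower miss rate gains, so point A scores higher; with a smaller beta the preference flips to point B. The formula is also consistent with the long-query behaviour of full matching: a miss rate of 100% with no false alarms gives a TWV of 0.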
Results
• For type 1 queries, the partial SS method is clearly worse than the DTW method.
• For type 2 and 3 queries, the partial SS method is comparable to the DTW method.
• For type 3 queries, the partial SS method is significantly better than the DTW method in terms of MTWV.
• The two methods are highly complementary.
Conclusion
We have described the NNI system for the QUESST 2014 task:
• DTW-based subsystem
• Symbolic search subsystem
• Why the conventional SS system does not work, especially for long queries
• The proposed partial phone sequence SS method
• The NNI system results
In the future, research will focus on reducing the false alarms introduced by the partial matching method.
Thanks!