WIKIPEDIA-BASED KERNELS
FOR DIALOGUE TOPIC TRACKING
Seokhwan Kim, Rafael E. Banchs, Haizhou Li
Human Language Technology Department
Institute for Infocomm Research (I2R)
6th May 2014
ICASSP, Florence, Italy
Contents
• Introduction
• Problem Definition
• Method
• Evaluation
• Conclusions
Motivation
• Spoken Dialogue Systems
– Next-generation User Interface
– The most natural way for human-human communication
• Single-task dialogues
– Most previous work focuses on a single target task
• E.g., Flight Reservation, Bus Information Guide, Restaurant Booking
– This limits practical use
• Multi-task dialogues
– [Lin et al. 1999, Ikeda et al. 2008, Celikyilmaz et al. 2011]
– Selecting the most probable system at each turn
– Each system is built and operated independently of the others
Related Work
• Text categorization-based Dialogue Topic Identification
– [Nakata et al., 2002; Lagus & Kuusisto, 2002; Adams & Martell, 2008]
– Differences from written texts
• Determinations of topics
– User’s intentions
– System’s decisions
• Available features
– Unable to see the future turns
• Knowledge-based Dialogue Topic Suggestion
– External Knowledge Sources
• E.g., Domain Models, Heuristics, Agendas
• [Roy & Subramaniam, 2006; Young et al., 2007; Bohus & Rudnicky, 2003; Lee et al., 2008]
– Limited flexibility
• To handle user-initiative cases
– High cost
• To build a sufficient amount of resources
Dialogue Topic Tracking
• Subtasks
– Dialogue Segmentation
• Segmenting a session into topically coherent sub-dialogues
– Topic Transition Identification
• Identifying the next topic category at each topic transition
Dialogue Topic Tracking
• Example
Wikipedia-based Kernel Method
• Vector Space Model
– The simplest approach to representing features for supervised machine learning methods
– Each turn is an instance, represented as a weighted term vector
– Lack of semantic or domain-specific aspects
• Each word is considered as an independent and identical unit
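The vector space model above can be sketched in a few lines. This is a minimal illustration only: the slides say "weighted term vector" without naming the weighting, so the tf-idf scheme, the whitespace tokenizer, and the sample turns are all assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(turns):
    """Return one {term: weight} dict per turn (tf-idf is an assumed weighting)."""
    tokenized = [turn.lower().split() for turn in turns]
    n_turns = len(tokenized)
    # Document frequency: in how many turns each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_turns / df[t]) for t in tf})
    return vectors


# Hypothetical turns, not taken from the corpus.
turns = [
    "where is the merlion",
    "the merlion is at marina bay",
    "how about food",
]
vectors = tfidf_vectors(turns)
```

Every term is its own dimension here, so "merlion" and "marina" share nothing, which is the limitation the last bullet points at.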
Wikipedia-based Kernel Method
• Wikipedia for Dialogue Topic Tracking
– As an external knowledge source
– Without significant effort for building resources
– Previous work
• [Breuing et al., 2011; Wilcock, 2012]
• Focusing only on a single type of information from Wikipedia
• Wikipedia-based Kernel Method
– Aiming at incorporating various knowledge from Wikipedia
– To map the data into a higher dimensional feature space
• Vector Space Extension
• Vector Transformation
Wikipedia-based Kernel Method
• Vector Extension
– [Figure: the dialogue turns (U: user, S: system) are mapped to a term vector x, which is extended with a concept vector of weights β1, β2, …, β|D| over the Wikipedia articles d1, d2, …, d|D|]
Wikipedia-based Kernel Method
• Vector Transformation
– Each extended vector is transformed into a new space
– Transformation Matrix S
– s(di, dj) is the relatedness between di and dj
– Update of Concept Vector Values
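A rough sketch of the extension and transformation steps follows. The slide's formulas did not survive extraction, so this assumes that S is a |D| x |D| matrix with S[i][j] = s(di, dj) and that only the concept part of the extended vector is multiplied by S. The function names and the toy values are hypothetical.

```python
def transform_concepts(concept_vec, S):
    """Multiply the concept row vector by S, so each concept weight also
    receives contributions from related concepts (assumed update rule)."""
    size = len(concept_vec)
    return [
        sum(concept_vec[i] * S[i][j] for i in range(size))
        for j in range(size)
    ]


def extended_vector(term_vec, concept_vec, S=None):
    """Concatenate term and concept vectors; apply S to the concepts if given."""
    concepts = concept_vec if S is None else transform_concepts(concept_vec, S)
    return list(term_vec) + list(concepts)


term_vec = [1.0, 0.0, 2.0]   # toy term weights
concept_vec = [1.0, 0.0]     # toy weights for articles d1, d2
S = [[1.0, 0.5],
     [0.5, 1.0]]             # toy relatedness: s(d1, d2) = 0.5

plain = extended_vector(term_vec, concept_vec)        # extension only
mapped = extended_vector(term_vec, concept_vec, S)    # extension + transformation
```

In this toy case the turn mentions only d1, yet after the transformation d2 receives weight 0.5 because it is related to d1.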
Measures of Contextual Relatedness
• How to compute s(di, dj)?
• Category Relatedness
– Based on hierarchical structures of Wikipedia categories
• depth(d): the length of the path from the root node to d
• lcs(di, dj): the least common subsumer of the two articles in the hierarchy
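The formula itself is missing from the extracted slide. Since the definitions of depth and lcs match the measure of Wu and Palmer (1994) in the reference list, the sketch below assumes the form 2 · depth(lcs) / (depth(di) + depth(dj)). It also assumes a single-parent tree, which is a simplification of Wikipedia's category graph; the toy hierarchy is hypothetical.

```python
def path_to_root(node, parent):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def depth(node, parent):
    """Length (in edges) of the path from the root to `node`."""
    return len(path_to_root(node, parent)) - 1


def lcs(a, b, parent):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a, parent))
    for node in path_to_root(b, parent):
        if node in ancestors_a:
            return node
    return None


def category_relatedness(a, b, parent):
    """Wu-Palmer style score (assumed form): 2*depth(lcs) / (depth(a)+depth(b))."""
    common = lcs(a, b, parent)
    total = depth(a, parent) + depth(b, parent)
    if common is None or total == 0:
        return 0.0
    return 2.0 * depth(common, parent) / total


# Hypothetical child -> parent hierarchy.
parent = {
    "Singapore": "Root",
    "Attractions": "Singapore",
    "Food": "Singapore",
    "Merlion": "Attractions",
    "Zoo": "Attractions",
    "Laksa": "Food",
}
```

With this toy hierarchy, two attractions score higher than an attraction paired with a dish, because their least common subsumer sits deeper in the tree.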
Measures of Contextual Relatedness
• Category Overlap Score
– Based on the ratio of common categories of two concepts
– By Jaccard’s coefficient
• Contents Similarity
– Based on the cosine similarity between term vectors from the body texts
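Both measures on this slide are standard and can be sketched directly. The representation is an assumption: categories as Python sets and term vectors as {term: weight} dicts. How the body texts are tokenized and weighted is not stated in the slides.

```python
import math


def category_overlap(cats_a, cats_b):
    """Jaccard coefficient: shared categories over all categories of both."""
    a, b = set(cats_a), set(cats_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(w * vec_b.get(t, 0.0) for t, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical inputs for illustration.
overlap = category_overlap({"A", "B", "C"}, {"B", "C", "D"})   # 2 shared of 4
similarity = cosine_similarity({"x": 1.0, "y": 1.0}, {"x": 1.0})
```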
Measures of Contextual Relatedness
• Co-occurrence Frequency
– To represent the discourse relatedness obtained from Wikipedia
– Assumption
• The more frequently mentions of two concepts co-occur, the more similar the aspects both concepts take in dialogue flows
– By normalized point-wise mutual information
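Normalized pointwise mutual information can be computed from raw counts, as in the sketch below. The counting unit is an assumption: the slides do not say whether co-occurrence is counted per article, paragraph, or sentence, so `total` simply stands for the number of such units.

```python
import math


def npmi(n_xy, n_x, n_y, total):
    """Normalized PMI in [-1, 1] from co-occurrence counts.

    n_xy: units mentioning both concepts; n_x, n_y: units mentioning each;
    total: number of units overall (the unit itself is an assumption).
    """
    if n_xy == 0:
        return -1.0          # never co-occur
    p_xy = n_xy / total
    if p_xy == 1.0:
        return 1.0           # co-occur in every unit
    pmi = math.log(p_xy / ((n_x / total) * (n_y / total)))
    return pmi / -math.log(p_xy)


always_together = npmi(10, 10, 10, 100)   # close to 1: perfect association
independent = npmi(1, 10, 10, 100)        # close to 0: chance level
never_together = npmi(0, 10, 10, 100)     # -1: never co-occur
```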
Measures of Contextual Relatedness
• Geographical Closeness
– Domain-specific Measure
– Based on the geographic coordinate information of spatial concepts
• Final Score
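For the geographical closeness measure above, the slides give no formula. The sketch below is therefore purely illustrative: it assumes the haversine great-circle distance between (latitude, longitude) pairs and a hypothetical mapping 1 / (1 + distance in km) from distance to a closeness score in (0, 1].

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geo_closeness(coord_a, coord_b):
    """Hypothetical closeness: 1 at zero distance, decaying with distance."""
    return 1.0 / (1.0 + haversine_km(*coord_a, *coord_b))


# Illustrative coordinates only.
near = geo_closeness((1.28, 103.85), (1.29, 103.86))
far = geo_closeness((1.28, 103.85), (1.40, 103.79))
```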
Evaluation
• Dataset
– Dialogue Corpus on Singapore Tour Guide
• Real human-human mixed-initiative conversations
• Between guides and tourists
• Stats
– 35 dialogue sessions
– 21 hours
– 19,651 utterances
• Topics
– 1,642 topic segments
– 9 topic categories
» Opening, Closing, Itinerary, Accommodation, Attraction, Food, Transportation, Shopping, Other
– Wikipedia Collection
• 3,115 articles related to Singapore
• Collected from Wikipedia database dump as of Feb 2013
Evaluation
• Models
– Training Instances
• 8,318 instances for user-turn-level segmentation
• 1,607 instances for dialogue-segment-level topic prediction
– Support Vector Machine (SVM) Models
• BOW: Baseline only with term vector space
• WK0: Extended vector without transformation
• WK1: s(di, dj) = s1(di, dj)
• WK2: s(di, dj) = s1(di, dj) + s2(di, dj)
• WK3: s(di, dj) = s1(di, dj) + s2(di, dj) + s3(di, dj)
• WK4: s(di, dj) = s1(di, dj) + s2(di, dj) + s3(di, dj) + s4(di, dj)
• WK5: s(di, dj) = s1(di, dj) + s2(di, dj) + s3(di, dj) + s4(di, dj) + s5(di, dj)
• Metrics
– Five-fold Cross-validation
– Segmentation: P/R/F
– Topic Prediction: Accuracy
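The WK1 to WK5 configurations and the segmentation metrics above can be illustrated as follows. This is a sketch under assumptions: each measure s1..s5 is taken to be a |D| x |D| nested list, and segmentation output is taken to be one binary boundary label per turn. The function names and toy data are hypothetical.

```python
def cumulative_configs(measures):
    """Map 'WK1'..'WKk' to the element-wise sum of the first k measures."""
    size = len(measures[0])
    total = [[0.0] * size for _ in range(size)]
    configs = {}
    for k, s in enumerate(measures, start=1):
        total = [[total[i][j] + s[i][j] for j in range(size)]
                 for i in range(size)]
        configs[f"WK{k}"] = total
    return configs


def precision_recall_f1(gold, predicted):
    """P/R/F over binary labels, where 1 marks a segment boundary."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy measures over two articles and toy boundary labels.
s1 = [[1.0, 0.2], [0.2, 1.0]]
s2 = [[1.0, 0.5], [0.5, 1.0]]
configs = cumulative_configs([s1, s2])
scores = precision_recall_f1([1, 0, 0, 1, 0, 1], [1, 0, 1, 1, 0, 0])
```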
Evaluation
• Comparison of dialogue topic tracking performances
Evaluation
• Distributions of errors on the cascaded results with WK5
– 71.4% of errors result from segmentation
– 60.0% of errors occurred for system-initiative cases
Conclusions
• Summary
– Wikipedia-based Kernel Method for Dialogue Topic Tracking
– To incorporate various types of information from Wikipedia
– Experimental results show the merits of our proposed approach in mixed-initiative dialogues
• Ongoing Work
– Using a wider variety of knowledge from Wikipedia
– To be presented at ACL 2014
• A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain Knowledge from Wikipedia
References
• B. Lin, H. Wang, and L. Lee, “A distributed architecture for cooperative spoken dialogue agents with coherent dialogue state and history,” in Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1999.
• S. Ikeda, K. Komatani, T. Ogata, and H. G. Okuno, “Extensibility verification of robust domain selection against out-of-grammar utterances in multidomain spoken dialogue system,” in Proceedings of the 9th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2008, pp. 487–490.
• A. Celikyilmaz, D. Hakkani-Tür, and G. Tür, “Approximate inference for domain detection in spoken language understanding,” in Proceedings of the 12th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2011, pp. 713–716.
• T. Nakata, S. Ando, and A. Okumura, “Topic detection based on dialogue history,” in Proceedings of the 19th International Conference on Computational Linguistics (COLING), 2002, pp. 1–7.
• K. Lagus and J. Kuusisto, “Topic identification in natural language dialogues using neural networks,” in Proceedings of the 3rd SIGdial Workshop on Discourse and Dialogue, 2002, pp. 95–102.
• P. H. Adams and C. H. Martell, “Topic detection and extraction in chat,” in Proceedings of the 2008 IEEE International Conference on Semantic Computing, 2008, pp. 581–588.
• S. Roy and L. V. Subramaniam, “Automatic generation of domain models for call centers from noisy transcriptions,” in Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, 2006, pp. 737–744.
• S. Young, J. Schatzmann, K. Weilhammer, and H. Ye, “The hidden information state approach to dialog management,” in Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007.
• D. Bohus and A. Rudnicky, “Ravenclaw: dialog management using hierarchical task decomposition and an expectation agenda,” in Proceedings of the European Conference on Speech, Communication and Technology, 2003, pp. 597–600.
• C. Lee, S. Jung, and G. G. Lee, “Robust dialog management with n-best hypotheses using dialog examples and agenda,” in Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2008, pp. 630–637.
• G. Salton, A. Wong, and C. S. Yang, “A vector space model for automatic indexing,” Communications of the ACM, vol. 18, no. 11, pp. 613–620, 1975.
• G. Wilcock, “Wikitalk: a spoken Wikipedia-based open-domain knowledge access system,” in Proceedings of the Workshop on Question Answering for Complex Domains, 2012, pp. 57–70.
• A. Breuing, U. Waltinger, and I. Wachsmuth, “Harvesting Wikipedia knowledge to identify topics in ongoing natural language dialogs,” in Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011, pp. 445–450.
• P. Wang and C. Domeniconi, “Building semantic kernels for text classification using Wikipedia,” in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, pp. 713–721.
• Z. Wu and M. Palmer, “Verbs semantics and lexical selection,” in Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, 1994, pp. 133–138.
• C. C. Chang and C. J. Lin, “Libsvm: a library for support vector machines,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no. 3, p. 27, 2011.