Wikipedia-based Kernels for Dialogue Topic Tracking

WIKIPEDIA-BASED KERNELS

FOR DIALOGUE TOPIC TRACKING

Seokhwan Kim, Rafael E. Banchs, Haizhou Li

Human Language Technology Department

Institute for Infocomm Research (I2R)

6th May 2014

ICASSP, Florence, Italy

Contents

• Introduction

• Problem Definition

• Method

• Evaluation

• Conclusions

Pg 2

Contents

• Introduction


• Method

• Evaluation

• Conclusions

Pg 3

Motivation

• Spoken Dialogue Systems

– Next-generation User Interface

– The most natural way for human-human communication

• Single-task dialogues

– Most previous work focuses on single target task

• Eg. Flight Reservation, Bus Information Guide, Restaurant Booking

– Cause limitations in practical uses

• Multi-task dialogues

– [Lin et al. 1999, Ikeda et al. 2008, Celikyilmaz et al. 2011]

– Selecting the most probable system at each turn

– Each system is independently built and operated from others

Pg 4

Related Work

• Text categorization-based Dialogue Topic Identification– [Nakata et al., 2002; Lagus&Kuusisto, 2002; Adams&Martell, 2008]

– Differences from written texts

• Determinations of topics

– User’s intentions

– System’s decisions

• Available features

– Unable to see the future turns

• Knowledge-based Dialogue Topic Suggestion

– External Knowledge Sources

• Eg. Domain Models, Heuristics, Agendas• [Roy&Subramaniam, 2006; Young et al., 2007; Bohus&Rudnicky, 2003; Lee et al. 2008

– Limited flexibility

• To handle user-initiative cases

– High cost

• To build a sufficient amount of resources

Pg 5

Contents

• Introduction


• Method

• Evaluation

• Conclusions

Pg 6

Dialogue Topic Tracking

• Subtasks

– Dialogue Segmentation

• Segmenting a session into topically coherent sub-dialogues

– Topic Transition Identification

• Identifying the next topic category at each time of topic transition

Pg 7

Dialogue Topic Tracking

• Example

Pg 8

Contents

• Introduction


• Method

• Evaluation

• Conclusions

Pg 9

Wikipedia-based Kernel Method

• Vector Space Model

– The simplest approach to represent features for supervised machine

learning methods

– An instance for each turn A weighted term vector

– Lack of semantic or domain-specific aspects

• Each word is considered as an independent and identical unit

Pg 10


• Wikipedia for Dialogue Topic Tracking

– As an external knowledge source

– Without significant effort for building resources

– Previous work

• [Breuing et al., 2011; Wilcock, 2012]

• Focusing only on a single type of information from Wikipedia

• Wikipedia-based Kernel Method

– Aiming at incorporating various knowledge from Wikipedia

– To map the data into a higher dimensional feature space

• Vector Space Extension

• Vector Transformation

Pg 11


• Vector Extension

Pg 12

Term Vector Concept Vector

…U: ---------------------------S: ---------------------------U: ---------------------------S: ---------------------------U: ---------------------------S: ---------------------------

…

x

β1β2

Β|D|

d1

d2

d|D|

⁞


• Vector Transformation

– Each extended vector is transformed into a new space

– Transformation Matrix S

– s(di, dj) is the relatedness between di and dj

– Update of Concept Vector Values

Pg 13


Pg 14

Measures of Contextual Relatedness

• How to compute s(di, dj)?

• Category Relatedness

– Based on hierarchical structures of Wikipedia categories

• depth(d): the length of the path from the root node to d

• lcs(di, dj): the least common subsume of the two articles in the hierarchy

Pg 15


• Category Overlap Score

– Based on the ratio of common categories of two concepts

– By Jaccard’s coefficient

• Contents Similarity

– Based on the cosine similarity between term vectors from the body texts

Pg 16


• Co-occurrence Frequency

– To represent the discourse relatedness obtained from Wikipedia

– Assumption

• The more frequently the mentions about two concepts co-occurred

• The more similar aspects both concepts take in dialogue flows

– By normalized point-wise mutual information

Pg 17


• Geographical Closeness

– Domain-specific Measure

– Based on the geographic coordinate information of spatial concepts

• Final Score

Pg 18


Pg 19

Contents

• Introduction


• Method

• Evaluation

• Conclusions

Pg 20

Evaluation

• Dataset

– Dialogue Corpus on Singapore Tour Guide

• Real human-human mixed initiative conversations

• Between guides and tourists

• Stats

– 35 dialogue sessions

– 21 hours

– 19,651 utterances

• Topics

– 1,642 topic segments

– 9 topic categories

» Opening, Closing, Itinerary, Accommodation, Attraction, Food,

Transportation, Shopping, Other

– Wikipedia Collection

• 3,115 articles related to Singapore

• Collected from Wikipedia database dump as of Feb 2013

Pg 21

Evaluation

• Models

– Training Instances

• 8,318 instances for user-turn-level segmentation

• 1,607 instances for dialogue-segment-level topic prediction

– Support Vector Machine (SVM) Models

• BOW: Baseline only with term vector space

• WK0: Extended vector without transformation

• WK1: s(di, dj) = s1(di, dj)

• WK2: s(di, dj) = s1(di, dj) + s2(di, dj)

• WK3: s(di, dj) = s1(di, dj) + s2(di, dj) + s3(di, dj)

• WK4: s(di, dj) = s1(di, dj) + s2(di, dj) + s3(di, dj) + s4(di, dj)

• WK5: s(di, dj) = s1(di, dj) + s2(di, dj) + s3(di, dj) + s4(di, dj) + s5(di, dj)

• Metrics

– Five-fold Cross-validation

– Segmentation: P/R/F

– Topic Prediction: Accuracy

Pg 22

Evaluation

• Comparison of dialogue topic tracking performances

Pg 23

Evaluation

• Distributions of errors on the cascaded results with WK5

– 71.4% of errors result from segmentation

– 60.0% of errors occurred for system-initiative cases

Pg 24

Contents

• Introduction


• Method

• Evaluation

• Conclusions

Pg 25

Conclusions

• Summary

– Wikipedia-based Kernel Method for Dialogue Topic Tracking

– To incorporate various types of information from Wikipedia

– Experimental results show the merits of our proposed approach in

mixed-initiative dialogues

• Ongoing Work

– Using more various types of knowledge from Wikipedia

– To be presented at ACL 2014

• A Composite Kernel Approach for Dialog Topic Tracking with Structured

Domain Knowledge from Wikipedia

Pg 26

Thank You

Pg 27

Contact: [email protected]

http://www.i2r.a-star.edu.sg/

http://www.i2r.a-star.edu.sg/

References

• B. Lin, H. Wang, and L. Lee, “A distributed architecture for cooperative spoken dialogue agents with coherent dialogue state and history,” in Proceedings

of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1999.

• S. Ikeda, K. Komatani, T. Ogata, H. G. Okuno, and H. G. Okuno, “Extensibility verification of robust domain selection against out-of-grammar utterances

in multidomain spoken dialogue system.,” in Proceedings of the 9th Annual Conference of the International Speech Communicatiuon Association

(INTERSPEECH), 2008, pp. 487–490.

• A. Celikyilmaz, D. Hakkani-T¨ur, and G. T¨ur, “Approximate inference for domain detection in spoken language understanding.,” in Proceedings of the

12th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2011, pp. 713–716.

• T. Nakata, S. Ando, and A. Okumura, “Topic detection based on dialogue history,” in Proceedings of the 19th international conference on Computational

linguistics (COLING), 2002, pp. 1–7.

• K. Lagus and J. Kuusisto, “Topic identification in natural language dialogues using neural networks,” in Proceedings of the 3rd SIGdial workshop on

Discourse and dialogue, 2002, pp. 95–102.

• P. H. Adams and C. H. Martell, “Topic detection and extraction in chat,” in Proceedings of the 2008 IEEE International Conference on Semantic

Computing, 2008, pp. 581–588.

• S. Roy and L. V. Subramaniam, “Automatic generation of domain models for call centers from noisy transcriptions,” in Proceedings of the 21st

International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, 2006, pp. 737–744.

• S. Young, J. Schatzmann, K. Weilhammer, and H. Ye, “The hidden information state approach to dialog management,” in Proceedings of the

International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007.

• D. Bohus and A. Rudnicky, “Ravenclaw: dialog management using hierarchical task decomposition and an expectation agenda,” in Proceedings of the

European Conference on Speech, Communication and Technology, 2003, pp. 597–600.

• C. Lee, S. Jung, and G. G. Lee, “Robust dialog management with n-best hypotheses using dialog examples and agenda.,” in Proceedings of the 46th

Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2008, pp. 630–637.

• G. Salton, A. Wong, and C.S. Yang, “A vector space model for automatic indexing,” Communications of the ACM, vol. 18, no. 11, pp. 613–620, 1975.

• G.Wilcock, “Wikitalk: a spoken wikipedia-based opendomain knowledge access system,” in Proceedings of the Workshop on Question Answering for

Complex Domains, 2012, p. 5770.

• A. Breuing, U. Waltinger, and I. Wachsmuth, “Harvesting wikipedia knowledge to identify topics in ongoing natural language dialogs,” in Proceedings of

the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011, pp. 445–450.

• P. Wang and C. Domeniconi, “Building semantic kernels for text classification using wikipedia,” in Proceedings of the 14th ACM SIGKDD international

conference on Knowledge discovery and data mining, 2008, pp. 713–721.

• Z. Wu and M. Palmer, “Verbs semantics and lexical selection,” in Proceedings of the 32nd annual meeting on Association for Computational Linguistics,

1994, pp.133–138.

• C. C. Chang and C. J. Lin, “Libsvm: a library for support vector machines,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no.

3, pp. 27, 2011.

Software

Wikipedia-based Kernels for Dialogue Topic Tracking