SpeakerLDA: Discovering Topics in Transcribed Multi-Speaker Audio Contents
Damiano Spina, Johanne R. Trippas, Lawrence Cavedon, Mark Sanderson

SpeakerLDA: Discovering Topics in Transcribed Multi-Speaker Audio Contents @ SLAM 2015






An Extreme Example: Discussing "Merengue" (Spanish)

What is the dialogue about?
Not considering speakers: {dance, egg, whip, Terpsichore, Latin, America, white, dessert}
vs.
Considering speakers: {dance, Terpsichore, Latin, America} and {dessert, whip, white, egg}


Hypothesis
Considering information about speakers (i.e., which words/fragments correspond to each speaker) would improve topic discovery.


Example: Topic Discovery for Recommendation

{dance, Terpsichore, Latin, America}
{dessert, whip, white, egg}
More Like This

More content about dance

More content about desserts


Topic Discovery in Multi-Speaker Audio Contents: Applications
Multi-Speaker Audio Contents:
- Podcasts (news, shows, interviews, etc.)
- Meetings
- TV programs

Applications:
- Content-based recommendation: more like this
- Clustering: group search results according to topics (e.g., search result presentation)


Research Question
What is the impact, in terms of effectiveness, of adding speaker information to a topic model, compared to traditional approaches (i.e., LDA)?


Topic Discovery

[Image from Blei, D. Probabilistic Topic Models, Communications of the ACM, 2012]
Each topic is a distribution over words; each document is a distribution over topics.
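These two distributions can be illustrated with a toy version of LDA's generative story. The topics, words, and probabilities below are hypothetical examples (echoing the talk's merengue running example), not from Blei's figure:

```python
import random

random.seed(0)

# Each topic is a distribution over words.
topics = {
    "dance":   {"dance": 0.5, "Latin": 0.3, "America": 0.2},
    "dessert": {"egg": 0.4, "whip": 0.3, "white": 0.3},
}
# Each document is a distribution over topics.
doc_topic_dist = {"dance": 0.6, "dessert": 0.4}

def sample(dist):
    """Draw one item from a {item: probability} distribution."""
    r = random.random()
    cum = 0.0
    for item, p in dist.items():
        cum += p
        if r < cum:
            return item
    return item  # guard against floating-point rounding

def generate_document(n_words):
    """LDA's generative story: pick a topic per word, then a word from it."""
    words = []
    for _ in range(n_words):
        topic = sample(doc_topic_dist)
        words.append(sample(topics[topic]))
    return words

print(generate_document(8))
```

Inference in LDA runs this story in reverse: given only the generated words, it recovers the two distributions.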


Topic Discovery vs. Topic Segmentation
Topic Discovery: characterizes documents according to topics; 1 document ~ distribution of topics.
Topic Segmentation: characterizes how a conversation evolves over time in terms of topics; 1 document ~ sequence of topics (e.g., t1, t3, t2, t3, t2, t1 over time).


Topic Discovery vs. Topic Segmentation
Not using speaker information — Topic Discovery: Latent Dirichlet Allocation (LDA) [Blei et al., 2003]; Topic Segmentation: TextTiling [Hearst, 1997], [Purver et al., 2006]
Using speaker information — Topic Discovery: ?; Topic Segmentation: SITS [Nguyen et al., 2012]


Topic Discovery vs. Topic Segmentation
Not using speaker information — Topic Discovery: Latent Dirichlet Allocation (LDA) [Blei et al., 2003]; Topic Segmentation: TextTiling [Hearst, 1997], [Purver et al., 2006]
Using speaker information — Topic Discovery: SpeakerLDA (RQ); Topic Segmentation: SITS [Nguyen et al., 2012] (RQ')


Proposed Approach: SpeakerLDA

1. Split documents (D) according to speakers (S)
2. Run LDA
3. Combine the topic distributions obtained for each speaker's pseudo-document
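The steps above can be sketched as follows. This is an assumed structure, not the authors' implementation: a real system would run LDA (e.g., via a topic-modeling library) over the speaker pseudo-documents, so a simple normalized word-frequency distribution stands in for the LDA step here, and the combination rule (weighting each speaker by how much they spoke) is one plausible choice:

```python
from collections import Counter, defaultdict

def split_by_speaker(utterances):
    """Step 1: turn (speaker, words) utterances into pseudo-documents."""
    pseudo_docs = defaultdict(list)
    for speaker, words in utterances:
        pseudo_docs[speaker].extend(words)
    return pseudo_docs

def topic_distribution(words):
    """Stand-in for Step 2 (LDA inference): normalized word frequencies."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def speaker_lda(utterances):
    """Step 3: infer a distribution per speaker pseudo-document, then
    combine them, weighting each speaker by their share of the words."""
    pseudo_docs = split_by_speaker(utterances)
    total_words = sum(len(ws) for ws in pseudo_docs.values())
    combined = defaultdict(float)
    for speaker, words in pseudo_docs.items():
        weight = len(words) / total_words
        for topic, p in topic_distribution(words).items():
            combined[topic] += weight * p
    return dict(combined)

# Hypothetical two-speaker transcript in the talk's merengue spirit.
utterances = [
    ("A", ["dance", "Latin", "America"]),
    ("B", ["dessert", "whip", "egg"]),
    ("A", ["dance", "Terpsichore"]),
]
print(speaker_lda(utterances))
```

Because each per-speaker distribution sums to 1 and the weights sum to 1, the combined distribution is itself a valid probability distribution.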


Proposed Approach: SpeakerLDA


Evaluation Framework
Topic models are typically evaluated by:
- computing intrinsic metrics (e.g., perplexity) of the model on an unseen set of documents, or
- applying them to external information access tasks (e.g., topic detection as a clustering task), which needs a manually annotated ground truth. One possible measure: precision/recall of clustering relationships.
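The precision/recall of clustering relationships mentioned above can be sketched by counting document pairs that share a cluster in the system output vs. in the gold standard. Document IDs and cluster labels below are hypothetical:

```python
from itertools import combinations

def same_cluster_pairs(labels):
    """All unordered document pairs that share a cluster label."""
    return {(a, b)
            for (a, la), (b, lb) in combinations(sorted(labels.items()), 2)
            if la == lb}

def pairwise_precision_recall(system, gold):
    """Precision/recall over same-cluster document pairs."""
    sys_pairs = same_cluster_pairs(system)
    gold_pairs = same_cluster_pairs(gold)
    tp = len(sys_pairs & gold_pairs)  # pairs both agree belong together
    precision = tp / len(sys_pairs) if sys_pairs else 0.0
    recall = tp / len(gold_pairs) if gold_pairs else 0.0
    return precision, recall

gold = {"d1": "dance", "d2": "dance", "d3": "dessert", "d4": "dessert"}
system = {"d1": 0, "d2": 0, "d3": 0, "d4": 1}
print(pairwise_precision_recall(system, gold))  # (0.333..., 0.5)
```

Here the system keeps the gold pair (d1, d2) together (1 true positive out of 3 system pairs and 2 gold pairs), giving precision 1/3 and recall 1/2.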


Evaluation Framework II
Is there any test collection suitable for measuring differences between our approach and existing topic models?

It must satisfy the following conditions:
Each topic is discussed in two or more documents

Include spoken documents with two or more speakers

The AMI Corpus satisfies both conditions!


The AMI Corpus

Augmented Multi-Party Interaction (AMI) Corpus

100 hours of recorded audio

More than 100 meetings with multiple speakers (generally 4)

Real and elicited scenario-driven meetings. Speakers play different roles: interface designer, project manager, industrial designer, marketing.

Manual transcriptions, including speaker segmentation

Transcripts segmented according to topics and subtopics


Generating a Gold Standard for Topic Discovery


Work in Progress
Compare the effectiveness of SpeakerLDA vs. LDA (and vs. topic segmentation approaches).
Extrinsic evaluation: compare system outputs to a clustering gold standard.


Using the AMI Corpus topic segmentation annotations as the clustering gold standard

Varying the initial number of topics

Considering the n most frequent topics in the topic-document distribution for topic assignment


Work in Progress
Compare the effectiveness of SpeakerLDA vs. LDA (and vs. topic segmentation approaches).
Extrinsic evaluation: compare system outputs to a clustering gold standard.

Challenge: How to define a valid clustering gold standard from topic segmentation annotations?

Opportunity: Compare system output to a topic distribution gold standard. Generate distributions from annotated segments.


Gold topic distribution for the meeting IS1008c:
{closing=0.09, opening=0.03, components...=0.21, discussion=0.06, industrial...=0.21, interface=0.21, marketing...=0.20}
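The slides leave open how a system's topic distribution would be scored against such a gold distribution. One candidate measure (an assumption on our part, not stated in the talk) is Jensen-Shannon divergence, sketched here with hypothetical toy distributions:

```python
from math import log2

def jsd(p, q):
    """Jensen-Shannon divergence between two {topic: prob} dicts.

    Symmetric, and bounded in [0, 1] when using log base 2.
    """
    support = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2.
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in support}

    def kl(a):
        # KL divergence from a to the mixture (0 * log 0 taken as 0).
        return sum(a[t] * log2(a[t] / m[t])
                   for t in support if a.get(t, 0.0) > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)

gold = {"dance": 0.7, "dessert": 0.3}      # hypothetical gold distribution
system = {"dance": 0.6, "dessert": 0.4}    # hypothetical system output
print(jsd(gold, system))
```

A lower divergence would mean the system's distribution is closer to the gold distribution derived from the annotated segments.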


Conclusions
- We propose SpeakerLDA, a topic model that takes speaker information into account to discover what a set of audio documents (such as podcasts) is about.
- It can be used for clustering search results or content-based recommendation (more like this).
- We are currently investigating how to generate a clustering gold standard from the topic segmentation annotations in the AMI Corpus.
- Evaluate topic models by comparing against a topic distribution gold standard?


Thank you!

- For dessert we have...'Merengue'!



@damiano10

[email protected]
