Nicola Orio – University of Padova
Cynthia C. S. Liem – Delft University of Technology
Geoffroy Peeters – UMR STMS IRCAM-CNRS, Paris
Markus Schedl – Johannes Kepler University, Linz
MusiClef: Multimodal Music Tagging Task
MediaEval, Pisa, 05/10/2012
Multimodal music tagging

• Definition
  • Songs of a commercial music library need to be categorized according to their usage in TV and radio broadcasts (e.g. soundtracks, jingles)
• Practical motivation
  • Searching for suitable music for video productions is a major activity for professionals and lay users alike
  • Collaborative filtering systems are increasingly taking over this role, notwithstanding their known limitations (long tail, cold start, ...)
  • Annotating professional music libraries is another important professional activity
Human assessment
Different sources of information are routinely exploited by professionals to overcome the limitations of individual media
Goals of MusiClef
• To focus evaluation on professional application scenarios
  • Textual description of music items
• To grant replication of experiments and results
  • The feature extraction phase is crucial: released features were computed with a public, open-source library (MIRToolbox)
• To promote the exploitation of multimodal sources of information
  • Content (audio) + context (tags and web pages)
• To disseminate music-related initiatives
  • Outside the music information retrieval community
Evaluation initiatives – 1
• MIREX (since 2004)
  • Community-based selection of tasks
  • Many tasks address audio feature extraction algorithms
  • Participants submit algorithms that are run by the organizers
  • Music files are not shared with participants
• Million Song Dataset (since 2011)
  • Task on music recommendation proposed by the organizers
  • Audio features are computed using proprietary algorithms
  • Only features are shared with participants
Evaluation initiatives – 2
• Quaero-Eval (since 2012)
  • Tasks agreed upon with participants
  • Strategies to grant public access to evaluation results
  • Participants run training experiments on a shared repository
  • Runs on the test set are made by the organizers
Test collection – 1
• Individual songs of pop and rock music
  • 1355 songs (from 218 artists)
  • Train (975) and test (380) split
• Social tags
  • Gathered from the Last.fm API
• Multilingual sets of web pages related to artists and albums
  • Mined by querying Google
• Acoustic features: MFCCs (using MIRToolbox) with a window length of 200 ms and 50% overlap
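The released features were computed with MIRToolbox in MATLAB; the window/overlap arithmetic behind them can be sketched in NumPy. This is a minimal, hypothetical framing helper (not the task's actual code), showing how 200 ms windows with 50% overlap are cut from a signal before MFCCs are computed per frame:

```python
import numpy as np

def frame_signal(y, sr, win_s=0.2, overlap=0.5):
    """Split a signal into overlapping analysis frames
    (200 ms windows with 50% overlap, as in the released features)."""
    win = int(win_s * sr)                      # samples per window
    hop = int(win * (1 - overlap))             # hop size between frame starts
    n_frames = 1 + max(0, (len(y) - win) // hop)
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    return y[idx]                              # shape: (n_frames, win)
```

At 22050 Hz this gives 4410-sample windows with a 2205-sample hop; each frame would then be passed to an MFCC routine.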
Test collection – 2
• Test collection created starting from the “500 Greatest Songs of All Time” (Rolling Stone)
  • Expected high number of social tags and web pages
• Ground truth created by domain experts
  • 355 tags selected (167 genre, 288 usage)
  • Tags associated with fewer than 20 songs were discarded
• Reference implementation in MATLAB
  • Gives participants an example to run a complete experiment
  • Code for the evaluation already made available
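The pruning step above (discarding tags attached to fewer than 20 songs) can be sketched as follows. This is a hypothetical helper, not the organizers' code; the `song_tags` mapping (song id to tag list) is an assumed format:

```python
from collections import Counter

def prune_tags(song_tags, min_songs=20):
    """Drop tags assigned to fewer than `min_songs` songs,
    mirroring the ground-truth pruning described above."""
    counts = Counter(t for tags in song_tags.values() for t in set(tags))
    keep = {t for t, c in counts.items() if c >= min_songs}
    return {s: [t for t in tags if t in keep] for s, tags in song_tags.items()}
```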
Evaluation measures

• Standard IR measures:
  • Accuracy
  • Precision
  • Recall
  • Specificity
  • F-measure
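For a binary per-tag decision, all five measures follow from the confusion counts. A minimal sketch (hypothetical helper, not the released evaluation code):

```python
def tag_metrics(tp, fp, fn, tn):
    """Per-tag binary metrics from confusion counts
    (true/false positives and negatives)."""
    accuracy    = (tp + tn) / (tp + fp + fn + tn)
    precision   = tp / (tp + fp) if tp + fp else 0.0
    recall      = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_measure   = (2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return {'accuracy': accuracy, 'precision': precision, 'recall': recall,
            'specificity': specificity, 'f-measure': f_measure}
```

Note how accuracy and specificity stay high even for weak taggers when most song/tag pairs are negative, which is visible in the baseline tables below.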
Examining tags more closely
• Some tags are more equal than others…
• Thus, we propose to also analyze results employing a higher-level tag categorization
Example tags: hard rock, countryside, melancholic, travel, bright, ballroom
Tag categorization – 1
• Affective, mood-related aspects:
  • Activity: the amount of perceived musical activity, without implying strong positive or negative affective qualities (e.g. 'fast', 'mellow', 'lazy')
  • Affective state: affective qualities that can only be connected and attributed to living beings (e.g. 'aggressive', 'hopeful')
  • Atmosphere: affective qualities that can be connected to environments (e.g. 'chaotic', 'intimate')
Tag categorization – 2
• Situation, time, and space aspects of the music:
  • Physical situation: concrete physical environments (e.g. 'city', 'night')
  • Occasion: implications of time and space, typically connected to social events (e.g. 'holiday', 'glamour')
• Sociocultural genre (e.g. 'new wave', 'r&b', 'punk')
• Sound qualities:
  • Timbral aspects (e.g. 'acoustic', 'bright')
  • Temporal aspects (e.g. 'beat', 'groove')
• Other (e.g. 'catchy', 'evocative')
Reference implementation

• Written in MATLAB and released publicly
• Simple and straightforward approaches:
  • Individual GMMs for audio, user tags, and web pages
  • Tagging process: 1-NN classification using symmetrized KL divergence
• Scenarios tested:
  • Audio, user tags, and web pages individually
  • Majority vote
  • Union
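The 1-NN tagging step above can be sketched compactly. The reference implementation uses GMMs, for which symmetrized KL has no closed form; this hypothetical sketch substitutes a single diagonal-covariance Gaussian per song (where the KL is closed-form) to illustrate the idea:

```python
import numpy as np

def kl_diag(m1, v1, m2, v2):
    """KL(N1 || N2) for diagonal-covariance Gaussians (closed form)."""
    return 0.5 * np.sum(np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def sym_kl(a, b):
    """Symmetrized KL divergence between two (mean, variance) models."""
    return kl_diag(*a, *b) + kl_diag(*b, *a)

def predict_tags(test_model, train_models, train_tags):
    """1-NN tagging: copy the tags of the closest training song
    under symmetrized KL divergence."""
    dists = [sym_kl(test_model, m) for m in train_models]
    return train_tags[int(np.argmin(dists))]
```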
Baseline results – 1
• Evaluation of the submitted runs and of the reference implementation
• Results with different modalities over the full dataset:
strategy    accuracy  recall  precision  specificity  f-measure
audio       0.894     0.148   0.127      0.939        0.126
tags        0.898     0.061   0.039      0.942        0.037
web pages   0.897     0.050   0.007      0.954        0.011
majority    0.880     0.123   0.086      0.922        0.086
union       0.824     0.240   0.115      0.845        0.134
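The majority and union fusion strategies evaluated in this table can be sketched as follows (a hypothetical helper, not the reference implementation's code), taking one predicted tag set per modality:

```python
from collections import Counter

def fuse(per_modality_preds, mode='majority'):
    """Combine per-modality tag sets (e.g. audio, tags, web pages)
    by strict majority vote or by union."""
    counts = Counter(t for preds in per_modality_preds for t in set(preds))
    if mode == 'union':
        return set(counts)
    need = len(per_modality_preds) // 2 + 1   # strict majority threshold
    return {t for t, n in counts.items() if n >= need}
```

Consistent with the table, union trades precision for recall: every modality's false positives survive, but so do tags that only one modality finds.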
Baseline results – 2
[Figure: per-category baseline results for the nine tag categories: 1. activity/energy, 2. affective state, 3. atmosphere, 4. other, 5. situation: occasion, 6. situation: physical, 7. sociocultural genre, 8. sound: temporal, 9. sound: timbral]
Participation
• Initially there was a lot of interest: about 8 parties explicitly expressed interest
• But ultimately only one participant submitted (LUTIN UserLab)
  • Aggregation of estimators
• Currently investigating what happened to the 7 others
  • So far, it appears ISMIR 2012 was inconveniently close
  • The 3 other MusiClef co-organizers will discuss this there
Conclusions
• We established a multimodal music tagging benchmark task
  • Special effort in facilitating deeper tag analysis
• We would like a 2013 multimodal music benchmark task
  • Depending on survey input
  • Depending on your input
For contact and more information: [email protected]
Thank you for your attention!